ABSTRACT



The goal of this study is to provide a single source of data that enables the 
selection of an appropriate voice recognition (VR) application for a decision 
support system (DSS) as well as for other computer applications. A brief 
background of both voice recognition systems and decision support systems is 
provided with special emphasis given to the dialog component of DSS. The 
categories of voice recognition discussed are human factors, environmental 
factors, situational factors, quantitative factors, training factors, host computer 
factors, and experiments and research. Each of these areas of voice recognition is 
individually analyzed, and specific references to applicable literature are included. 

This study also includes appendices that contain: 

• A glossary (including definitions) of phrases specific to both decision support
systems and voice recognition systems.

• Keywords applicable to this study. 

• An annotated bibliography (alphabetically and by specific topics) of current 
VR systems literature containing over 200 references. 

• An index of publishers. 

• A complete listing of current commercially available VR systems. 
I. INTRODUCTION



A. BACKGROUND 

The rapid influx of powerful microcomputers has provided both the incentive 
and capability to enhance the productivity of humans. These powerful and 
inexpensive workhorses are being exploited for automating routine tasks, acquiring 
and communicating information, and the intelligent support of decision making. 
Of major importance is the effort to enhance the productivity of humans who 
control these machines through the use of human-computer interfaces that both 
maximize human performance and take advantage of the growing capabilities of 
these computer systems. 

It is estimated that, for over 95 percent of human-computer interactions, 
people costs are greater than the machine costs [Infotech 79]. Actions that reduce 
the human cost and simplify the human interface will have great impact on the 
computer industry. A technology must explore these interfaces in order to grow 
and develop to its full potential. 

Many forms of man-machine interfaces have been developed, including 
cathode ray tube displays, printers, keyboards, joysticks, etc. However, speech is
recognized to be the most natural and fastest form of human communication, and
should be considered as an interface technique for system optimization [LeFever
87].

Research into voice recognition (VR) systems has been ongoing for over 30 
years. Research into decision support systems (DSS), which evolved from 
management information systems over 15 years ago, now is maturing. The two
technologies, which until now have matured separately, are logical candidates for
merging. Thus the focus of this study is the application of voice recognition 
systems to decision support systems. A Glossary of Terms used in this study is 
provided in Appendix A. 

B. VOICE RECOGNITION SYSTEMS 

Voice recognition is defined as the ability of a computer or other device to 
recognize spoken words correctly and to translate them into a predetermined 
output string to the computer [LeFever 87]. Voice recognition is also called 
automatic speech recognition and by other names, as listed in Appendix B. It is 
important to note that the term voice recognition refers to and concerns only 
command input via the human voice. It does not include computerized voice output 
or speech synthesis. 

There are many advantages to using voice input to computer systems. In
general, a voice recognition system:

• is more accurate than conventional forms of input 

• allows for concurrent use of hands, eyes, and other senses 

• allows freedom of movement from a specified location 

• can be used in low light or dark areas 

• is faster than conventional forms of input 

• promotes the use of the computer system or application that it is used 
in conjunction with 

• is easy to learn and easy to use 

• promotes productivity 

• works better in multilingual environments than conventional input 

• works equally well for individuals ranging from novice typists 
through expert typists 

• works well for many handicapped individuals [Poock 80, Poock 81, 
Armstrong 80, Baker 84, LeFever 87] 
Dobney classifies voice recognition as "a fifth generation language or more
concisely a fifth generation concept." [Dobney 87] Voice recognition, along with
other fifth generation concepts, is expected to be critical to the future of all
computer applications.

C. DECISION SUPPORT SYSTEMS 

There is no generally recognized single definition of decision support systems. 
The definitions in use cover a broad spectrum of what is and is not a DSS [Keen 87].

For this study, the following definition will be used: 

The application of available and suitable computer-based technology to 
help improve the effectiveness of managed decision making in semi- 
structured tasks. [Keen 87] 

The key aspects of DSS include: 

• They are computer based systems. 

• They are used by decision makers. 

• They help decision makers confront ill-structured problems. 

• They work through direct interaction. 

• They utilize data analysis models. [Sprague 82] 

This study will focus on the fourth aspect, direct interaction between the decision 
maker and the computer system. 

The basic DSS has three components: data, dialog, and models [Sprague 82].
These are referred to as the DDM paradigm of a DSS and the relationships are 
illustrated in Figure 1.1. The importance of the dialog component cannot be over- 
emphasized, since all the capabilities of the DSS must be articulated and 
implemented through it. 
[Figure: (Dialog-Data-Model) DDM Paradigm]

Figure 1.1. The Dialog, Data, Model Components of the DSS Framework [Sprague 82]



This dialog component consists of three subcomponents, as illustrated in Figure
1.2.

• The action language is what the user can do in communicating with 
the system. 

• The presentation or display language is what the user sees. 

• The knowledge base is what the user must know in order to operate 
the system. This can take the form of help menus, reference cards or 
instructions, a user's manual or information that previously has been 
learned. 
Figure 1.2. The Dialog System User Interface [Sprague 80]



This study primarily considers the action language of DSSs and its 
implementation through the use of voice input. Secondary consideration is given to 
minimizing the size of the knowledge base through the use of a natural language 
interface and by optimizing the presentation language so that it will naturally 
encourage and prompt proper input. 

No single all-encompassing or overall best dialog mode presently exists. That
is, no system has the ability to handle a variety of human interaction styles, shifting 
between styles at the user’s request. Regardless of a user's experience with 
computers or the problem or tasks, the specific dialog mode of a given system must 
be learned and used, in order to use the system. This is true even if the user is 
already familiar with another dialog mode for another system. 

As noted by Sprague, "Dialog will profit significantly from the inclusion of 
natural language processing techniques and voice recognition." [Sprague 87] 
D. GOALS AND OBJECTIVES

The primary objective of this study is to provide a current, concise, condensed, 
and summarized single source of data that will enable selection of an appropriate 
voice recognition application for a given decision support system. In essence this is 
a non-automated aid for making voice recognition system decisions related to the 
design of an automated DSS. 

A secondary objective is to provide users, developers, researchers, and all 
others concerned with voice recognition input with a current reference guide to 
voice recognition research. Keywords used in locating references are provided in 
Appendix B. This guide is included in Appendix C, an annotated bibliography of 
current VR literature, with subappendices that contain references to the annotated 
bibliography by functional areas of DSS. Appendix D furnishes the publishing 
source of all literature contained in the annotated bibliography and thus facilitates 
retrieval of hard-to-find articles. 

A third objective is to provide a current listing of all commercially available
voice recognition systems. This list is contained in Appendix E,
along with information concerning the compatibility of these voice systems with
current computer systems. The voice recognition systems listed include a wide range
of capabilities, and are usable on systems varying in size from mainframe
computers to desktop microcomputers.

The overall goal of this study is to supply a useful guide for decisions 
concerning the implementation or use of voice input for decision support systems as 
well as for other computer applications. 
E. SCOPE AND METHODOLOGY

1. Scope 

This study primarily considers only current voice recognition literature, 
that is, books, articles, and reports that are less than five years old (published after 
1 January 1983). A limited amount of older literature, determined especially 
pertinent and worthy of note, also is included. 

Keywords used in searching the literature are listed in Appendix B. 
Words representing voice and speech-related topics not included in this study also 
are listed there. No experiments or case studies were conducted for this thesis. 

2. Research Methodology 

Exhaustive research was conducted to identify all current and accessible 
voice recognition literature and voice recognition systems. This research was 
conducted using Naval Postgraduate School and University of California, Santa 
Cruz, resources and via locally accessible computer networks. 

The universe of papers from which the database was drawn consists of all 
literature that contains keywords listed in Appendix B. Initially over 1000 
references were located. These items were reviewed and filtered to determine 
those applicable to DSSs. As a result of a review process, over 230 articles were 
classified as applicable to DSSs and are included in the final database in the form of 
an annotated bibliography. In many cases this bibliography also contains excerpts, 
abstracts, or summaries of those articles related to voice recognition that are 
considered to be useful for users, developers, researchers, and others concerned 
with voice input to decision support systems. 
II. DATA ANALYSIS



A. BACKGROUND 

As fifth generation computer technology approaches, the use of "intelligent 
systems" will give increasing flexibility to the input devices of the future. The data 
collected for this study provides knowledge needed to pick the best method of 
human-computer interaction for the specific environments of a given DSS. 

It has been proposed that speech is the human's highest capacity and most 
natural form of communications [Lombardo 84]. Therefore computer voice 
recognition would be the most natural way for humans to interface with machines. 
The problem preventing the widespread acceptance of VR seems to be that most 
people are simply not aware that VR exists or what it can really do for them. 

This chapter discusses various research areas or categories of both voice 
recognition systems and DSSs. Data are placed into several categories in order to 
facilitate locating answers to specific problems and to aid in performing research 
related to a specific DSS application or environment. These categories were 
arrived at through an empirical process of reviewing the reports and noting logical 
trends in the literature. Each research area is related to an Appendix in this report 
containing references to articles germane to that area. 

B. HUMAN FACTORS 

Categories of human factors included in this study are (1) stress, (2) 
multimodality, (3) user speaking experience level, (4) computer experience level, 
and (5) the size of the vocabulary. These topics are related to several human
factors areas: occupational, operational, psychological, physiological, and
personal [Yellen 83].

Human factors is discussed first because of its importance. No matter how fast 
the computer is, how efficient its speech recognition algorithm is, or how pretty its 
displays are, it will not be used effectively or efficiently unless human factors 
knowledge applicable to system implementation has been reviewed and 
incorporated. 

Appendix C1, Section 1, contains a listing of material applicable to the area of
human factors. Sections 2 through 6 of that Appendix include references that are
specific for each category within the scope of human factors.

1. Stress Related Factors 

Stress influences the sound wave frequency of an individual's speaking 
voice. Additionally, stressed speakers often appear to talk in longer bursts, with 
shorter pauses separating the bursts. Psychological stress also influences an 
individual's vocal production in other ways. However, there is no consensus in the 
literature concerning how stress can be analyzed to predict an outcome. 

Stress may be either physiological, psychological, or a combination of 
both. Physiological stress is more clear cut than psychological, and refers to the 
result of human stresses such as heat, pressure, electric shock, and similar stimuli. 
Psychological stress comes from many sources and relates to an individual's ability 
to cope, adapt, or react to an unfamiliar, unfriendly, or threatening environment, 
or to the influence of that environment on the individual. 

Psychological stress can be further subdivided into situational and self- 
induced stress. Situational stress is the influence of unfavorable environmental 
factors (excluding physical factors) on an individual. These factors are beyond the
individual's control and may include circumstances such as public speaking,
deadlines, quotas, etc. Self-induced stress is the self-imposition of a condition or 
stimulus. These include self-imposed goals, deadlines, or performance 
requirements of any type with which an individual forces himself to function above 
a "comfortable" or "easy" level [French 83]. It is important to remember that in 
some cases it may not be possible to separate physical from psychological stimuli. 

Research in the area of stress and voice recognition was found to be
limited. References are listed in Section 2 of Appendix C1.

2. Multimodal Factors 

Voice recognition systems are unique in their ability to free the user's
mind and eyes for carrying out visual tasks. A voice recognition system permits
the user to view graphics, screens, and decision aids, to oversee personnel, or to
read from a data source without having to look away in order to communicate
with the computer.

Baker states in her keynote address to the First International Conference 
on Speech Technology: 

Just as Darwin hypothesized that people developed spoken rather than 
gestural language so as to free up their hands and be able to communicate 
in the dark or out of sight, so speech recognition has seen its initial 
applications in "hands busy, eyes busy" applications. [Baker 84] 

Voice recognition systems promise freedom from the distraction of 
interrupting the flow of work to recall codes and find keys. Voice recognition can 
free the operator from having to remain close to a specific physical installation, 
such as a video display terminal or keyboard. Additionally, the use of a wireless
microphone permits extensive mobility while talking to computers. French states
that

Voice-input could enable the operator to continue the task at the 
terminal, and simultaneously manipulate a visual representation of the 
problem they are involved in, for others’ benefit. This is a potential boom 
in the period of transition from a symbolic gestalt to an era of much more 
wide spread computer literacy. [French 83] 

As noted by Yellen, with this increased mobility also come increased
problems; breath noise can now create a serious problem [Yellen 83]. An
individual who is involved in little or no physical movement while engaged in voice
recognition can obtain very high recognition accuracy, but errors may be induced
once the user begins to move. When using a close-talking, noise-canceling
microphone, inhaling does not appear to cause problems; however, exhaling will
produce signal levels comparable to speech levels.

The advantage of having one's hands, eyes, and mind free to perform other
tasks could be the major contributing factor in the choice of voice recognition input
to a computer application. This multimodal aspect of voice recognition enhances or
complements traditional tactile input methods rather than replacing them in total. A
listing of literature related to the multimodal aspects of voice recognition is
contained in Section 3 of Appendix C1.

3. Speaker's Experience Level 

Many studies have been done measuring the speaker’s experience with 
voice recognition systems and the resulting quality of the output or task 
performance. The research in this area is referenced in Section 4 of Appendix C1.

Most studies agree that novices quickly pick up voice recognition system
skills, regardless of their initial experience level, and that their
performance rapidly approaches that of experienced users. It is important
to note that professional typing skills require a long learning period and diminish
quickly with disuse. On the other hand, speaking is a natural output mode for the
human and is practiced every day by all. The user has only to restrict spoken
utterances to those which the machines can recognize.

4. Computer Experience Level 

It is a credit to the adaptability of humans that they can use today's
software when so much of it still abounds with non-memorable commands.
Complex multiple command/control/shift keystrokes are often required, which
only frequent and experienced users can recall. Commands that require precise
syntax, spacing, and order can be simplified by the use of voice commands. Once 
the utterance is recognized by the computer it is input correctly. Long commands 
or passwords which require accurate input and multiple keystrokes are easily 
mistyped, but can be input accurately with a voice recognition system. 

The video display can provide directions for the next voice input through
the use of menus or with a graphical representation. This can be of special value to
both DSSs and Group DSSs, enabling rapid generation of "what if" brainstorming
or alternatives generation.

Section 5 of Appendix C1 provides a guide to publications that deal with a
user's computer experience level. Many techniques are listed in these articles 
which enable better performance, given a specific experience level. 

5. Vocabulary Factors 

The vocabulary selected for a voice recognition system affects the speed
and accuracy of the system in many ways. The selection and structure of the
vocabulary are extremely important to the success of the system. The vocabulary
should be as natural as possible, while avoiding conflicting, confusing, or similar
sounding utterances.

Most current voice recognition systems perform well with small 
vocabularies. When the size of these vocabularies gets large (greater than 1000 
utterances) the probability of error increases, along with the processing time. The 
possibility of confusion between words increases with vocabulary size also, as does 
the probability that similar sounding words have been included. Better speech 
recognition systems usually have recognition algorithms designed to reject rather 
than guess at questionable or similar words. 
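
The reject-rather-than-guess behavior described above can be sketched as follows. This is only an illustration under stated assumptions, not a method taken from the cited literature: the two-element feature vectors, the Euclidean distance measure, and the thresholds `max_distance` and `min_margin` are all hypothetical choices made for the example.

```python
import math

def recognize(features, templates, max_distance=1.0, min_margin=0.3):
    """Return the best-matching word, or None to reject the utterance.

    templates maps each vocabulary word to a stored feature vector
    (a hypothetical representation chosen for this sketch).
    """
    # Rank every vocabulary word by its distance to the input pattern.
    ranked = sorted(
        (math.dist(features, stored), word) for word, stored in templates.items()
    )
    best_distance, best_word = ranked[0]
    # Reject when even the closest template is a poor match.
    if best_distance > max_distance:
        return None
    # Reject when the runner-up is nearly as close (similar sounding words).
    if len(ranked) > 1 and ranked[1][0] - best_distance < min_margin:
        return None
    return best_word

# Hypothetical templates; "quit" and "quid" are deliberately close together.
templates = {"save": [1.0, 0.0], "quit": [0.0, 1.0], "quid": [0.1, 1.0]}
```

With these templates, an input close to "save" is accepted, while an input lying between "quit" and "quid" is rejected instead of being guessed. As more similar sounding words are added, more inputs fall into the rejected region, which is consistent with the vocabulary size effect noted above.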

Humans have a low tolerance level for waiting for machines and for 
machines that make errors; studies show that humans tend to abandon systems that 
perform in this manner. With very large vocabulary sets, the amount of data to be 
processed for each recognition is intolerably large unless coding is optimal and 
optimized comparisons are used. Accuracy is increased and recognition time 
decreased by using vocabulary subsets. A given subset usually is entered by saying 
the subset's name or title (also called the node word). Once in this subset or node, 
the system will search and recognize only the words included in this subset. This 
increases both speed and accuracy, and allows for different output for a given 
input. 

For example, a subset of numbers may be entered with the node word 
"number". Only words representing those numbers contained within the node will 
be recognized (along with node words which exit the subset). This allows the use of 
homonyms (such as "two" and "to") without confusion. When in the subset of 
"numbers", the utterance "to" or "two" will produce an output of "2". When in
other systems the utterance "to" or "two" will produce the output string of "to" (or
any other preprogrammed output desired).
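
The node word mechanism can be sketched in a few lines. This is a minimal illustration under stated assumptions: text strings stand in for recognized acoustic patterns, and the class name `NodeVocabulary` and the node name "text" are hypothetical and do not come from any system surveyed in this study.

```python
class NodeVocabulary:
    """Vocabulary split into subsets (nodes); only the active node is searched."""

    def __init__(self, nodes, start):
        self.nodes = nodes    # node word -> {utterance: output string}
        self.active = start   # node word of the subset currently in use

    def hear(self, utterance):
        # A node word switches the active subset and produces no output.
        if utterance in self.nodes:
            self.active = utterance
            return None
        # Otherwise only the active subset is searched; unknown words are rejected.
        return self.nodes[self.active].get(utterance)

vocab = NodeVocabulary(
    nodes={
        "number": {"one": "1", "two": "2", "to": "2"},
        "text": {"one": "one", "two": "to", "to": "to"},
    },
    start="text",
)
```

Saying "number" and then "to" yields "2", while the same utterance in the hypothetical "text" node yields "to". Because each lookup touches only the active subset, the search space stays small regardless of the total vocabulary size.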

The selected vocabulary can also be used to overcome problems related to 
cumbersome program commands or other often-forgotten commands through 
allowing for various input utterances to result in the same output string. For 
example, each computer network has a specific command to log off or check out of 
the system. These usually differ from system to system, and it may be difficult to 
remember which is required for each system. Programming three or four 
different utterances that produce the same correct output command will alleviate 
this problem (e.g., "log out", "log off", "check out", and "bye bye" might all
correspond to the output string "LOGOFF A M"; saying any of them produces the 
desired result). 
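
A many-to-one vocabulary of this kind amounts to a lookup table. The sketch below assumes that utterances arrive as already-recognized text; the output string "LOGOFF" is a placeholder for whatever command the host system actually requires, and the names `UTTERANCE_TO_OUTPUT` and `output_for` are hypothetical.

```python
# Placeholder output string; substitute the host system's actual log-off command.
LOGOFF_OUTPUT = "LOGOFF"

# Several spoken forms map to one output string.
UTTERANCE_TO_OUTPUT = {
    "log out": LOGOFF_OUTPUT,
    "log off": LOGOFF_OUTPUT,
    "check out": LOGOFF_OUTPUT,
    "bye bye": LOGOFF_OUTPUT,
}

def output_for(utterance):
    """Return the output string for a recognized utterance, or None if unknown."""
    return UTTERANCE_TO_OUTPUT.get(utterance.strip().lower())
```

Adding a further synonym requires only one more table entry (and, on a speaker dependent system, one more training sample); the application software itself is unchanged.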

Literature related to the area of speech recognition system vocabularies is
referenced in Section 6 of Appendix C1.

C. ENVIRONMENTAL FACTORS 

The environment in which a system will be used can play a decisive role in the 
choice of the input device and the voice recognition system to be used. In a United 
Nations command center that is dark, noisy, and filled with people from many 
nations with varied languages and customs, typing commands to a computer in one 
language in a fixed syntax is not practical. A well-implemented voice recognition 
system can do this job faster and without the mistakes normally associated with 
human translators. This "Tower of Babel" in which one can communicate as if with 
one tongue can be implemented with current technology through proper design. 



References to environment-related studies and research are found in Section 1 
of Appendix C2. Subsets of these references, related to specific environmental 
factors, are provided in Sections 2 through 6 of Appendix C2. 

1. Multilingual Factors 

The UN example may be the extreme, but in this world of instant world- 
wide telecommunications, international businesses, and melting pot nations, 
computers frequently must interface with people who speak different languages. 
Voice recognition systems are unconcerned with what language is spoken. They 
operate by matching the pattern of a given voice input (utterance) with a known 
pattern and then outputting some predesignated command string, therefore acting 
somewhat like a translator. For example, three languages may be spoken in an 
office (English, Spanish, Hindi). The computer software requires input in English. 
It is impractical to teach all the personnel both English and the commands required 
to operate the computer. A voice recognition system could be installed that 
"understands" utterances in all three languages and outputs the English commands 
that the software requires. 

Research and other literature related to voice recognition with 
multilingual environments is found in Section 2 of Appendix C2. 

2. Multicultural Factors 

Multicultural factors arise when different people have different ideas, 
styles, or ways of doing things. All computer operating systems perform similar 
functions, but there are subtle differences in the way commands are activated. For 
example, for a simple file transfer, the UNIX operating system uses a specific 
syntax that is completely different from that used by an IBM operating system.
Switching between MS-DOS, Z-DOS, Apple DOS, and the Macintosh operating
systems usually will require the user to look up the desired commands. 

Voice recognition systems can ease these difficulties by doing the lookup 
for the user: the same phrase, "save and quit", can be programmed to produce the 
same result on all systems. Voice recognition can also help equalize the varied 
experience, training, and typing skills of workers or executives exposed to new 
systems or new situations. 
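
The lookup that the voice recognition system performs on the user's behalf can be sketched as a table keyed by host and phrase. This is an illustration only: the vi and Emacs text editors are used here as stand-ins for the differing host systems (an assumption made for the example, since the operating systems named above are not modeled), and the function name `translate` is hypothetical.

```python
# One spoken phrase, a different output string per host program.
KEYSTROKES = {
    # vi: ":wq" followed by Enter (assumes the editor is in command mode).
    "vi": {"save and quit": ":wq\n"},
    # Emacs: C-x C-s (save) followed by C-x C-c (exit), as control characters.
    "emacs": {"save and quit": "\x18\x13\x18\x03"},
}

def translate(phrase, host):
    """Return the host-specific output string for a spoken phrase, or None."""
    try:
        return KEYSTROKES[host][phrase]
    except KeyError:
        # Unknown host or phrase: reject rather than guess.
        return None
```

The user says the same phrase everywhere; only the table entry selected by the current host differs.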

Literature sources related to multicultural factors are referenced in 
Section 3 of Appendix C2. 

3. Command and Control Environments 

Military establishments have done much work toward application of voice 
recognition systems in the command and control environment. The result of this 
work has been the acceptance and implementation of operational voice recognition 
systems in both strategic and tactical command and control environments. Most of 
this research can also benefit civilian business and industry applications. A listing 
of current research relating the areas of voice recognition systems and command 
and control is provided in Section 4 of Appendix C2. 

4. High Noise Environments 

Voice recognition systems have been used effectively in quiet office 
environments and also in noisy industrial assembly areas (noise levels in excess of 
100 dB). Although voice recognition equipment manufacturers have endeavored to
make their equipment work equally well in both environments, there are some 
locations where it is still too noisy for voice recognition systems to operate unaided. 
In such environments the use of a soundproof booth or a mask (such as a noise- 
reducing stenographer’s mask) can help; external noise is diminished and effective 
voice recognition can take place. 



Most researchers agree that, when using speaker dependent systems, 
"training" voice samples should be collected in the environment in which they will 
be used. This is especially true with noisy environments. 

Another method to improve voice recognition in a noisy environment is 
to use a speech enhancement algorithm. This is a software technique used to clean 
up the speech pattern before it enters the recognition device. A noise-canceling
microphone (like those that have been used in aircraft for years) also can be used.
This microphone samples the environmental background noise and aids in 
canceling out this background noise prior to its being sent to the recognizer. 

When noise is a consideration in the environment, a close look at research 
in this area is critical. Even for quiet office environments, an understanding of 
noise as it relates to voice recognition is recommended. Most mechanical things 
make noise, some at frequencies that the human cannot hear or chooses to ignore 
due to familiarity. The noise of a car, airplane, copy machine, or elevator during 
training or execution of voice recognition commands can result in puzzling 
problems. Noise-related articles and research are listed in Section 5 of Appendix 
C2. 

5. Low-Light Environments 

Low-light environments include both dimly lit control rooms and 
completely darkened auditoriums. In these environments, lighting can interfere 
with the performance of the operators' primary mission. The cockpit of an aircraft
and the bridge of a ship are specific environments where good night vision is
paramount. During daylight, normal manual input devices are adequate. At night,
a light source can have life-threatening consequences. A voice recognition system
allows for sightless input of computer commands plus mobility.

Voice recognition systems can be used to control the lights in a room. A 
more complex use would involve a microprocessor voice recognition system in a 
welder's helmet that controls the welding unit, turning it on and off and also
controlling the voltages or gas flow remotely. 

References relating voice recognition systems to low-light environments 
are listed in Section 6 of Appendix C2. 

D. SITUATIONAL FACTORS 

Situational factors covered in this study include (1) system use by a group, (2) 
use by an individual, and (3) use by handicapped persons. Appendix C3, Section 1, 
provides a complete list of voice recognition systems references related to such 
situational factors. 

1. Multiuser or Group Usage 

A multiuser system is a single system that is used by many people but only 
one at a time. Group usage is the use of a system by many people during the same 
time period. Both multiuser and group usage have similar problems and 
characteristics and have thus been grouped together in this study. 

Multiuser-oriented systems can be either speaker dependent or
independent. They can use either continuous or discrete speech recognition
algorithms. These terms are defined as follows.

• Speaker Dependent Systems: require adaptation (or "training") of the voice
recognition system to the speech characteristics of each user in order to
achieve recognition.

• Speaker Independent Systems: recognize speech regardless of the speaker,
and without system training in recognition of individual speech
characteristics of users.

• Continuous Speech Recognition: the process of extracting information from
strings of words even though the words run together as in natural speech
[Yellen 83].

• Discrete (Isolated) Speech Recognition: the process of transforming discrete
utterances (those with a significant pause between utterances) into
computer-recognized speech or text.
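
The "significant pause" that separates discrete utterances can be illustrated with a simple energy-based segmenter. This is a sketch under stated assumptions, not the algorithm of any particular recognizer: the input is assumed to be a list of normalized per-frame energies, and `silence_level` and `min_pause_frames` are hypothetical tuning parameters.

```python
def split_utterances(energies, silence_level=0.1, min_pause_frames=3):
    """Group frames into utterances separated by significant pauses.

    Returns a list of (first_frame, last_frame) index pairs, one per utterance.
    """
    utterances = []
    start = None        # first voiced frame of the utterance in progress
    last_voiced = None  # most recent voiced frame
    for i, energy in enumerate(energies):
        if energy > silence_level:
            if start is None:
                start = i
            last_voiced = i
        elif start is not None and i - last_voiced >= min_pause_frames:
            # The pause is long enough to close the current utterance.
            utterances.append((start, last_voiced))
            start = None
    # Close an utterance that runs to the end of the input.
    if start is not None:
        utterances.append((start, last_voiced))
    return utterances
```

A short dip in energy inside a word does not split it; only a run of at least `min_pause_frames` quiet frames does. Continuous speech offers no such pauses, which is one reason it is the harder recognition problem.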

Although speaker independent, continuous systems are better suited and 
require less training for multiple users, other combinations should not be ruled out, 
as they offer some advantages in specific circumstances. If the group situation also 
involves environmental factors (such as in a multilingual, high noise command 
post), the difficulty of selecting a system is compounded. Speed or vocabulary size 
or robustness may dictate that a speaker dependent, discrete speech system be used, 
even though system training time is higher and sampling is required. 

Implementing voice recognition input to a Group Decision Support 
System (GDSS) is difficult since there are four basic GDSS typologies, each
presenting its own unique problems. Figure 2.1 shows these four typologies [Bui
87].

Figure 2.1 (a) shows a bilateral relationship between a single-user-oriented
DSS and a group of users, the latter being considered as a whole. The
purpose of such a DSS is in essence the same as a single-user DSS [Bui 87]. In this
situation a voice recognition system that is robust enough to fit the needs of the 
group is required. If the size of the group is small and its composition constant, a 
discrete, speaker dependent system (requiring system training by the users) is 
practical. Otherwise, a speaker independent, continuous speech system would be 
most appropriate. With a varying group, the cost and time required to sample and 
train each user and the constraints on vocabulary size could be prohibitive. Figure
2.1 (b) extends the previous typology to include a GDSS, and has the same
associated problems.
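
The rule of thumb in the preceding paragraph can be written down as a small decision function. The following is only a minimal sketch: the class name, the field names, and especially the cutoff of five users for a "small" group are assumptions made for illustration and do not come from the literature cited here.

```python
from dataclasses import dataclass


@dataclass
class GroupProfile:
    """Hypothetical description of a user group; field names are illustrative."""
    size: int                    # number of people who will speak to the system
    stable_membership: bool      # True if the same people use it every session
    small_group_limit: int = 5   # assumed cutoff for a "small" group


def recommend_recognizer(group: GroupProfile) -> str:
    """Return the recognizer type suggested by the rule of thumb in the text."""
    if group.size <= group.small_group_limit and group.stable_membership:
        # Few, constant users: sampling and training each voice is practical.
        return "speaker dependent, discrete"
    # Large or changing groups: per-user sampling and training cost too much.
    return "speaker independent, continuous"


print(recommend_recognizer(GroupProfile(size=4, stable_membership=True)))
print(recommend_recognizer(GroupProfile(size=40, stable_membership=False)))
```

A real selection would also weigh the environmental and vocabulary factors discussed elsewhere in this chapter; the function captures only the group size and composition test.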




Figure 2.1. Typologies of Group Decision Support Systems [Bui 87] 

Figures 2.1 (c) and (d) illustrate a multilateral relationship between a 
member of a group (via a network of individual DSSs) and a GDSS. This typology 
allows the customization of individual DSSs to suit the needs of users. Currently 
the cost of a GDSS of this nature is too great for most user organizations;
centralized or off-site facilities (leased from or provided by a vendor), used by
many diverse groups, are the norm. Requirements for minimal training time and 
the variability of users usually necessitate the use of a robust, speaker independent, 
continuous speech system. 

There is no perfect solution to all situations. Each installation should be 
evaluated on its own merit by well-informed analysts. Section 2 of Appendix C3 
provides references to research in this area. 

2. Individual Usage 

Voice recognition for individual usage offers the greatest possible 
number of options. Many factors can be considered when optimizing the system, 
which can be speaker dependent or independent, and use continuous or discrete 
recognizers. 

Voice recognition systems can also be used to augment other input 
devices. They can be used simultaneously with keyboards and pointing devices. In 
the fields of desktop publishing, graphics manipulation, or computer-aided design, 
the task of entering text is secondary to the drawing of shapes or manipulation of 
objects on a screen. A voice recognition system or a "talkwriter" can be used to 
perform a text entry task and thus not break the flow of carrying out the primary 
task. 

The most important constraint when designing a system is the time and 
effort required for training. References relating voice recognition systems to 
individual users are provided in Section 3 of Appendix C3. 

3. Handicap Situations 

A physical handicap does not impair a person's mental ability or ability to 
produce. Just as a person with an amputated leg is given a prosthetic device to allow
mobility, a voice recognition system can be used as a prosthesis that can compensate
for some physical handicaps. Much work has been done in this area to bring 
independence, mobility, and productivity to the handicapped. Voice recognition 
systems not only can be used by the handicapped to operate computers, but they also 
can be used to control or manipulate other mechanical devices. 

Wheelchairs, prosthetic devices, communication devices, environmental 
controls, and many other systems may be controlled via the voice. The highly 
individual nature of designing a voice recognition system for the handicapped can 
result in the use of small, lightweight, power efficient, portable units, fine-tuned 
for the user and his or her needs. 

Research related to the handicapped and voice recognition is located in 
Section 4 of Appendix C3. Much of this research is equally applicable for use with 
non-handicapped individuals. 

E. QUANTITATIVE FACTORS 

Some of the benefits or advantages of computer voice recognition systems are 
subjective (user convenience or preference). Other aspects are undeniably 
quantitative. These include response and task time, accuracy, speed of entry, ease 
of use, and user productivity. References that evaluate or discuss these quantitative 
measures are found in Section 1 of Appendix C4. 

1. Time 

Time savings can be measured in many ways. Baker cites data from 
experiments that show communications via typewriter or hand-writing cannot even 
approach speech in terms of time or task efficiency [Baker 84]. Time savings,
whether measured in hours required to train the user on the system or in actual
hours saved by the use of voice recognition, are significant, especially in
common environments.
As voice recognition systems become commonplace and familiar, the time saved in
training personnel is expected to increase.

References in the area of response and task time, related to voice 
recognition systems, are included in Section 2 of Appendix C4. 

2. Accuracy 

One of the selling points of voice recognition systems is the accuracy of 
task performance. Once an utterance is correctly "understood", the system will 
produce a precise and correct output. However, two types of errors may occur: 
rejection and misrecognition. Rejection is the inability of a recognizer to classify an
utterance correctly. Misrecognition happens when a recognizer classifies an
utterance as something other than what was spoken. Since misrecognition is 
potentially more serious, most good recognizers are designed to reject rather than 
guess at marginal pattern matches. 

Experiments have shown accuracy rates ranging from a high of 99.8 
percent to lows in the range of 88.6 percent. The accuracy required of a system 
depends on the criticality of its application and the consequences of errors in the 
entered data. 
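
To make the distinction between the two error types concrete, the sketch below tallies the outcome of a set of recognition trials. It assumes a hypothetical log format in which each trial is a pair (spoken word, recognizer output) and an output of `None` stands for a rejection; neither the format nor the function name comes from any particular product.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def score_trials(trials: Iterable[Tuple[str, Optional[str]]]) -> Dict[str, float]:
    """Return the fraction of trials that were correct, rejected, or misrecognized.

    Each trial is (spoken_word, recognizer_output); None means "rejected".
    """
    counts: Counter = Counter()
    for spoken, output in trials:
        if output is None:
            counts["rejected"] += 1          # recognizer declined to classify
        elif output == spoken:
            counts["correct"] += 1
        else:
            counts["misrecognized"] += 1     # classified as a different word
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no trials to score")
    return {k: counts[k] / total for k in ("correct", "rejected", "misrecognized")}


# Hypothetical trial log, for illustration only.
log = [("one", "one"), ("two", None), ("three", "tree"), ("four", "four")]
print(score_trials(log))
```

Reporting the two error rates separately matters because, as noted above, a misrecognition enters wrong data silently while a rejection only asks the user to repeat the utterance.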

Research has shown that 183 percent more errors occur during manual 
data manipulation (typing) than when a voice recognition system is used [Yellen 
83]. Common typing errors such as the transposition of numbers or letters are 
almost eliminated with voice recognition. Correct entry of numbers is especially 
important since automated spelling and grammar checkers can catch most letter 
transpositions. 

Voice recognition accuracy can be improved in many ways, as covered in 
the Training Factors Section of this Chapter. Briefly stated, recognition accuracy
depends primarily on how the equipment is trained and on the experience level of
the speaker. Computer experience, time of week, accent, vital capacity and rate of 
air flow, speaker cooperativeness, and anxiety all affect accuracy to a lesser extent. 
References providing other data concerning accuracy are included in Section 3 of 
Appendix C4. 

3. Speed of Entry 

Most researchers agree that speech input is faster than keyboard input. 
Most individuals can speak twice as fast as the average typist can type. With a 
greater number of nontypists gaining access to computers, faster input modes are 
needed. The Macintosh personal computer from Apple uses a pointing device, 
pull-down windows, and other enhancements (which augment the keyboard) to 
produce a more natural interface. Experiments evaluating the Macintosh's pull- 
down windows in comparison with continuous voice recognition input 
demonstrated a distinct advantage in using continuous speech over the pull-down 
window technology of the Macintosh. [Sweeney 86] 

In other research, after only three hours of training, subjects were 17 
percent faster using voice entry than typing [Yellen 83]. 

References concerning task completion speed are listed in Section 4 of 
Appendix C4. 

4. Ease of Use 

Various studies have been carried out that demonstrate that speech input is 
easy to learn and easy to use. Users also develop a preference for speech input in 
time. References to these studies are located in Section 5 of Appendix C4. 



5. Productivity 

Computers excel in performing repetitious, time consuming, and boring 
tasks; humans do not. Thus productivity will be increased when such tasks can 
easily be turned over to a computer, especially if voice commands can be used to 
initiate the desired operations. 

One device that uses a voice system to increase productivity is the 
"talkwriter" or voice dictator. As the user speaks, words are recognized, entered 
into a file, and displayed on a screen. When more than one interpretation is 
possible, the system may provide a list of its best guesses; the user selects one. 
Better-developed models have very large vocabularies and automatic sentence 
punctuation. 
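
The "list of best guesses" behavior can be sketched as follows. This is an illustration under stated assumptions, not a description of any commercial talkwriter: it supposes that the recognizer returns a similarity score for each candidate word (higher meaning a closer match) and that a fixed score margin, here 0.1, decides whether the top candidate is unambiguous.

```python
from typing import List, Tuple


def n_best(candidates: List[Tuple[str, float]], n: int = 3,
           margin: float = 0.1) -> List[str]:
    """Return one word if the top score is clearly best, else the n best guesses.

    `candidates` holds (word, score) pairs; a higher score is a closer match.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    if not ranked:
        return []
    if len(ranked) == 1 or ranked[0][1] - ranked[1][1] >= margin:
        return [ranked[0][0]]                    # unambiguous: enter the word
    return [word for word, _ in ranked[:n]]      # ambiguous: let the user choose


# Hypothetical scores, for illustration only.
print(n_best([("print", 0.95), ("sprint", 0.40)]))
print(n_best([("their", 0.81), ("there", 0.80), ("they're", 0.62), ("the", 0.20)]))
```

When the function returns more than one word, the interface would display the list and wait for the user's selection.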

References relating voice recognition systems and productivity are listed 
in Section 6 of Appendix C4. 

F. TRAINING FACTORS 

Training of the user and the voice recognition system is one of the most 
important considerations in the effective implementation of systems. Methods of 
training depend on the type of voice system being implemented: speaker dependent 
or independent systems, and continuous or discrete speech systems. Certain 
training techniques have been developed that can improve recognition accuracy and 
reduce errors. The complete list of references to training is found in Section 1 of 
Appendix C5. 

1. Speaker Dependent Systems 

Speaker dependent systems require that samples of the potential user’s 
voice be placed in computer memory. The system basically is tuned for each user's
voice. Usually these systems work better than speaker independent systems
because the dependent system contains samples of the actual user's voice [Poock 83].

Speaker dependent systems are well suited to situations where the same
users perform the same job day in and day out. However, consistency is also a key 
element in successful recognition accuracy: a speaker may talk quite differently 
when training the machine than during operational use. Whenever possible,
training should be conducted in the same environment in which the equipment will be
operated, to minimize variability that may affect recognition accuracy. Other
factors that affect training and recognition accuracy are age, physical condition, 
fatigue, stress (emotional or physical), time of week, breath noise, microphone 
placement, familiarity, illness, peer pressure, workload, and external noise 
changes. When changes must occur, a new "training" session will usually retune 
the system and restore accuracy. 

Vocabulary size also affects recognition accuracy. As familiarity with a 
voice recognition system increases and the vocabulary is expanded, there will be 
more utterances that sound alike or similar to the recognizer; the system may start 
to reject words as unrecognized that formerly were accepted. Recognition of a
troublesome word sometimes can be improved by training duplicate entries of that
word separately.
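
The idea of stored voice samples, rejection of marginal matches, and duplicate training of a troublesome word can be illustrated with a toy template matcher. Everything here is an assumption made for the sketch: real recognizers use far richer acoustic features than the two-number vectors below, and the Euclidean distance and the `max_distance` cutoff are stand-ins for whatever matching scheme a given product uses.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


class TemplateStore:
    """Toy speaker dependent vocabulary: several stored samples per word."""

    def __init__(self) -> None:
        self.templates: Dict[str, List[Tuple[float, ...]]] = defaultdict(list)

    def train(self, word: str, features: Sequence[float]) -> None:
        # Each training pass adds another template; a troublesome word can be
        # trained several times so that more of its variants are covered.
        self.templates[word].append(tuple(features))

    def recognize(self, features: Sequence[float],
                  max_distance: float = 1.0) -> Optional[str]:
        best_word, best_dist = None, float("inf")
        for word, samples in self.templates.items():
            for sample in samples:
                dist = math.dist(sample, features)
                if dist < best_dist:
                    best_word, best_dist = word, dist
        # Reject (return None) rather than guess at a marginal match.
        return best_word if best_dist <= max_distance else None


store = TemplateStore()
store.train("stop", (1.0, 1.0))
store.train("start", (4.0, 4.0))
print(store.recognize((2.4, 2.4)))   # too far from every template: rejected
store.train("stop", (2.5, 2.5))      # second, separately trained sample
print(store.recognize((2.4, 2.4)))   # now matched by the duplicate entry
```

Retraining after a change in the speaker or the environment corresponds, in this sketch, to replacing or adding templates recorded under the new conditions.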

References to current research related to speaker dependent systems are 
listed in Section 2 of Appendix C5. 

2. Speaker Independent Systems 

A speaker independent speech system contains algorithms that can handle 
many different voices and dialects. The system is designed to recognize the voice 
of anyone who uses it, and thus is useful when many people are expected to operate
it daily. Unlike speaker dependent systems, speaker independent systems do not
require samples of a given user's voice. As a result, speaker independent systems 
do not usually perform as well as speaker dependent systems that are tuned to a 
specific user's vocal characteristics. 

Vocabulary size and structure play an especially important part in voice 
recognition accuracy with speaker independent systems. As the size of the 
vocabulary increases, the possibility of confusion between words also increases 
since there is a greater chance that there will be similar sounding words. 
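
The growth of confusable word pairs with vocabulary size can be illustrated with a rough screening function. Note the main assumption: spelling similarity (computed with Python's standard `difflib` module) is used here only as a crude stand-in for acoustic similarity, and the 0.75 cutoff is arbitrary. A real vocabulary review would compare the sounds of the words, not their spellings.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Iterable, List, Tuple


def confusable_pairs(vocabulary: Iterable[str],
                     threshold: float = 0.75) -> List[Tuple[str, str]]:
    """List word pairs whose spellings are at least `threshold` similar."""
    pairs = []
    for a, b in combinations(sorted(set(vocabulary)), 2):
        if SequenceMatcher(None, a, b).ratio() >= threshold:
            pairs.append((a, b))
    return pairs


small = ["alpha", "port"]
large = ["alpha", "port", "sort", "fort"]
print(confusable_pairs(small))
print(confusable_pairs(large))
```

A vocabulary of n words contains n(n-1)/2 distinct pairs, so the number of pairs that must be kept acoustically distinct grows much faster than the vocabulary itself.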

References related to speaker independent voice recognition systems are 
listed in Section 3 of Appendix C5. 

3. Continuous Speech Recognition 

Continuous or connected speech recognition systems can extract 
information from strings of words even though the words run together as in 
natural speech. Continuous speech is much more natural for humans to use than is 
discrete speech, which requires pauses between utterances. During the 1970s, most 
voice recognition systems used discrete speech. More recently, many accurate and 
inexpensive connected speech systems have been developed. 

Continuous speech systems can either be speaker dependent or 
independent. They usually involve larger vocabularies and require more powerful 
computers to run them. "Talkwriter" devices, discussed earlier, are connected
speech systems with very large vocabularies.

A new approach to continuous recognition moves away from matching 
scheme algorithms to more flexible "phonetic" recognition schemes. Phonemes, 
the basic units of all speech, are the basis for phonetic recognition. This type of
system is trained using words incorporating all combinations of phonemes. The
formulation of new words from these phonemes then is possible. 
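
One consequence of a phonetic scheme is that a new word can be added by listing its phonemes rather than by recording a new voice sample. The sketch below shows only that lexicon step. It assumes that some front end, not shown, has already turned the speech signal into a sequence of phoneme labels; the ARPAbet-style labels and the class name are illustrative choices, not part of the systems described here.

```python
from typing import Dict, Optional, Sequence, Tuple

Phonemes = Tuple[str, ...]


class PhoneticLexicon:
    """Maps phoneme sequences to words; illustrative only."""

    def __init__(self) -> None:
        self._by_phonemes: Dict[Phonemes, str] = {}

    def add_word(self, word: str, phonemes: Sequence[str]) -> None:
        # A new word needs only its phoneme spelling, not a new voice sample.
        self._by_phonemes[tuple(phonemes)] = word

    def lookup(self, phonemes: Sequence[str]) -> Optional[str]:
        return self._by_phonemes.get(tuple(phonemes))


lexicon = PhoneticLexicon()
lexicon.add_word("cat", ("K", "AE", "T"))
print(lexicon.lookup(("T", "AE", "K")))    # not yet in the vocabulary
lexicon.add_word("tack", ("T", "AE", "K"))  # built from already-known phonemes
print(lexicon.lookup(("T", "AE", "K")))
```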

References relating to continuous speech recognition systems are listed in 
Section 4 of Appendix C5. 

4. Discrete Speech Recognition 

Discrete or isolated speech recognition is the process of transforming 
discrete utterances into computer-recognized commands or text. Discrete speech 
contains a significant pause between utterances. A discrete speech recognizer must 
be able to detect a pause or low energy gap in order to function. Humans, however, 
sometimes find it difficult to speak with isolated words or broken phrases; hence 
discrete speech is not the most natural or desirable form of voice recognition. 
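
The pause detection that a discrete recognizer depends on can be sketched as an energy threshold applied to successive frames of the input signal. The frame energies, the threshold of 0.1, and the minimum pause of three frames are all assumed values chosen for illustration; an actual recognizer would derive them from its own signal processing and timing requirements.

```python
from typing import List, Optional, Sequence, Tuple


def split_utterances(energies: Sequence[float], threshold: float = 0.1,
                     min_pause_frames: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) frame index pairs, end exclusive, one per utterance.

    An utterance ends once `min_pause_frames` consecutive frames fall below
    `threshold`; shorter dips are treated as part of the same utterance.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None   # first frame of the utterance in progress
    quiet = 0                     # consecutive low-energy frames seen so far
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause_frames:
                segments.append((start, i - quiet + 1))
                start, quiet = None, 0
    if start is not None:
        # Input ended mid-utterance; trim any trailing quiet frames.
        segments.append((start, len(energies) - quiet))
    return segments


# Hypothetical frame energies: two utterances separated by a three-frame pause.
frames = [0, 0, .5, .6, .4, 0, .5, 0, 0, 0, .7, .8, 0, 0]
print(split_utterances(frames))
```

The single quiet frame inside the first utterance is too short to count as a pause, which is why speakers must leave a deliberate gap between words for such a system to separate them.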

Until recently, almost all commercial applications of voice recognition 
technology have been discrete voice recognition systems. Discrete systems still 
offer some advantages over continuous recognition systems in the areas of speed, 
accuracy, and especially cost. An extensive listing of currently available 
commercial voice recognition systems is contained in Appendix E. Usually, unless 
a system is advertised as being continuous or connected, it is understood to be of the 
discrete variety. References contained in Section 5 of Appendix C5 provide 
additional information about discrete speech recognition. 

5. Recognition Accuracy 

Training plays perhaps the most significant role in recognition accuracy. 
Problems often arise as a result of changes, either with the user or within the 
environment. A computer usually is much more sensitive to these changes than is 
the human. An impartial observer trained to detect subtle changes and who 
understands the mechanics of the system may be needed for trouble shooting and
system repair. For speaker dependent systems, a simple retraining session may
restore accuracy. The use of vocabulary nodes or subsets can increase both speed 
and accuracy (see the Vocabulary Factors Section). Duplicate words that result in 
the same output string may minimize rejection problems. Increasing the word 
recognition threshold may cause a higher rejection rate but can minimize 
misrecognition. 
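
The trade-off between rejection and misrecognition when the threshold is raised can be shown with a few lines of code. The sketch assumes that each trial yields the recognizer's best guess together with a match score between 0 and 1, and that a guess is accepted only when its score reaches the threshold. The trial data are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical trials: (word spoken, recognizer's best guess, match score).
TRIALS: List[Tuple[str, str, float]] = [
    ("open", "open", 0.92),
    ("close", "close", 0.81),
    ("save", "safe", 0.64),
    ("print", "print", 0.58),
    ("quit", "quick", 0.47),
]


def error_rates(results: List[Tuple[str, str, float]],
                threshold: float) -> Tuple[float, float]:
    """Return (rejection rate, misrecognition rate) at the given threshold."""
    if not results:
        raise ValueError("no trials to score")
    # Guesses scoring below the threshold are rejected, right or wrong.
    rejected = sum(1 for _, _, score in results if score < threshold)
    # Accepted guesses that differ from the spoken word are misrecognitions.
    misrecognized = sum(1 for spoken, guess, score in results
                        if score >= threshold and guess != spoken)
    return rejected / len(results), misrecognized / len(results)


for t in (0.4, 0.7):
    print(t, error_rates(TRIALS, t))
```

With these invented scores, raising the threshold from 0.4 to 0.7 removes both misrecognitions but also rejects one utterance that had been recognized correctly.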

Most systems come from the manufacturer adjusted to an optimal level;
making changes may only decrease performance. The operations manual gives the
best guidance on how adjusting the recognition parameters can
improve or detract from recognition. Publications listed in Section 6 of Appendix
C5 provide additional information on recognition accuracy. 

G. HOST COMPUTER FACTORS 

Voice recognition systems have been used successfully on all types and sizes of 
computers. Appendix E lists current voice recognition systems and describes the 
host computers that each is compatible with. Voice recognition has also been used 
in aircraft and spacecraft control; telephones; robot control; in teaching people how 
to speak; and by the handicapped to control body limbs, home appliances, 
wheelchairs, and other conveyances. 

As voice recognition systems mature they will become smaller, cheaper, have 
larger vocabularies, and be more robust. As a result of this they are expected to 
find their way into more computer applications and be involved in more aspects of 
human endeavor. Section 1 of Appendix C6 provides a complete list of references 
concerning host computer applications for voice recognition.

1. Microcomputers

Voice recognition systems can provide input to microcomputers via many 
different configurations, both internal and external. External "voice boxes" are
perhaps the easiest to install and maintain. They are self-contained units that may
have an interchangeable storage medium device that allows for swapping or 
installing vocabularies or software. These storage devices can take the form of 
floppy disks, tape cartridges, integrated circuit chip cartridges, compact optical 
disks, and other types of magnetic and optical storage devices. 

A replacement keyboard is one simple and inexpensive way to install a 
voice recognition system. These systems require no additional space or alterations 
to the microcomputer, they draw their power from the normal keyboard 
connection, and have ports for the voice recognition microphone and related 
switches built into the keyboard. Much of the unique voice recognition circuitry 
that usually is installed on an internal microcomputer board is in the keyboard. The 
disk storage device of the computer is used for its vocabulary and other software. 
Programming this type of system is easy as it mimics the normal keyboard 
keystroke inputs. Other software is unaffected by the system and is unaware that 
the user is entering commands via voice rather than by manual keystrokes. 
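
Keyboard emulation amounts to a lookup from each recognized utterance to the string of keystrokes it stands for. The table below is a minimal sketch: the vocabulary entries and the commands they emit are hypothetical, and an actual product would load such a table from its vocabulary file.

```python
from typing import Dict, List

# Hypothetical vocabulary: spoken utterance -> keystroke string sent to the host.
OUTPUT_STRINGS: Dict[str, str] = {
    "save": "\x13",          # Ctrl-S
    "directory": "DIR\r",    # typed command followed by Return
    "new line": "\r",
}


def to_keystrokes(recognized: List[str],
                  table: Dict[str, str] = OUTPUT_STRINGS) -> str:
    """Concatenate the output strings of the recognized utterances.

    Utterances that are not in the table produce no keystrokes.
    """
    return "".join(table[word] for word in recognized if word in table)


print(repr(to_keystrokes(["directory", "save"])))
```

Because the host sees only ordinary keystrokes, application software needs no modification, which is the point made in the paragraph above.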

Another implementation is through the use of an internal plug-in circuit 
card. This card operates in a manner similar to that of the keyboard, with the 
microphone and switches plugging into the card. These cards may incorporate 
other functions such as a modem or speech synthesis unit. 

Some voice systems are actually incorporated into the basic design of the 
microcomputer and are internal and omnipresent to its operation. Specific
information on these and other microcomputer voice systems is referenced in
Section 2 of Appendix C6.

2. Mainframes 

Mainframe computers may be accessed by the same types of methods as 
those noted for microcomputers. Links from microcomputers used either as dumb 
or intelligent terminals also may be used for access. 

Because of the powerful processors and large, fast-access storage devices 
associated with mainframe computers, much research has been done with voice 
recognition related to large computers. Research literature concerning mainframe 
computers and other large computer applications of voice recognition systems is 
listed in Section 3 of Appendix C6. 

3. Networks 

Computer networks and voice recognition systems come as a natural 
extension of microcomputer and mainframe application of voice recognition. 
Separate vocabulary nodes or specialized vocabularies may be used when accessing 
different networks. Passwords and entry procedures can be incorporated into the 
output strings, removing much of the drudgery related to moving through a 
network. The implementation of speech recognition also allows the use of voice 
verification as an automatic entry and access device. 
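
Vocabulary nodes can be pictured as a table in which only the words of the currently active node are candidates for recognition, and in which some words both emit an output string and switch the active node. The sketch below is an assumption-laden illustration: the node names, the words, and the command strings (such as `CONNECT MAILHOST`) are invented and do not correspond to any real network.

```python
from typing import Dict, Optional, Tuple

# Hypothetical node table: node -> {spoken word: (output string, next node)}.
# A next node of None means "stay in the current node".
NODES: Dict[str, Dict[str, Tuple[str, Optional[str]]]] = {
    "top": {
        "mail": ("CONNECT MAILHOST\r", "mail"),
        "files": ("CONNECT FILEHOST\r", "files"),
    },
    "mail": {
        "read": ("READ\r", None),
        "done": ("LOGOUT\r", "top"),
    },
    "files": {
        "list": ("DIR\r", None),
        "done": ("LOGOUT\r", "top"),
    },
}


class NodeVocabulary:
    """Only the words of the active node are candidates for recognition."""

    def __init__(self, nodes: Dict[str, Dict[str, Tuple[str, Optional[str]]]] = NODES,
                 start: str = "top") -> None:
        self.nodes = nodes
        self.active = start

    def handle(self, word: str) -> Optional[str]:
        entry = self.nodes[self.active].get(word)
        if entry is None:
            return None                  # word is not in the active node
        output, next_node = entry
        if next_node is not None:
            self.active = next_node      # e.g. entering or leaving a network
        return output


vocab = NodeVocabulary()
for spoken in ("read", "mail", "read", "done"):
    print(spoken, "->", repr(vocab.handle(spoken)))
```

Keeping each node small limits the number of similar-sounding words the recognizer must tell apart at any moment, which is why node structures can improve both speed and accuracy.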

Two of the largest networks used today are the telephone network and the 
automatic teller machine networks. Voice recognition systems have been proposed 
for these networks, and development efforts are underway. References related to 
voice recognition and networks are contained in Section 4 of Appendix C6. 



4. Types of Entry Required 

Data entry requirements vary from application to application. Voice 
input can be used to collect data, as in inventory control or quality control and 
assurance situations. Voice input can be used to input data or information into a 
computer, such as in order processing, or to manipulate data, as in automatic 
message preparation. Voice can be used to convert speech to text, as in the 
"talkwriter" or automatic dictation machines. Voice can verify data that has been 
entered by others or that has been mechanically or automatically entered via some 
other input device. Voice can be used to control industrial processes, machines, 
and robots. 

Each of these applications requires a different type of system to make it 
work optimally. References related to data entry systems are provided in Section 5 
of Appendix C6. 

H. EXPERIMENTS AND RESEARCH 

A vast amount of research has been conducted in both broad and specific areas 
of voice recognition. Section 1 of Appendix C7 contains references to this 
research. This research is further divided into logical groupings, to allow focused 
study. Section 2 of this Appendix covers research in the area of artificial 
intelligence. Section 3 looks at future research, that is, those areas in which new 
trends are developing or towards which research is predicted to move. Section 4 
deals with present research, covering work done in the last five years. Section 5 
includes literature related to research conducted prior to 1 January 1983. Many 
experiments and case studies have been conducted. Section 6 is devoted to these. 

A special area of interest has evolved relating the field of voice recognition to 
the area of natural language interfaces. Dobney states that natural language
interfaces and speech recognition are fifth generation concepts. A natural language
interface allows a user to express his or her request in English. Certain difficulties 
arise when using naturally spoken English. The problem is related to the use of 
homonyms, such as "I heard the song" and "I saw a herd of buffalo". A related 
difficulty results when phrases sound similar, such as "I scream" and "ice cream". 
[Dobney 87] The human mind has developed ways to sort out these problems; 
humans understand the context of what is being said, and are sensitive to shifts in 
context. Dobney presents some interface complexities which natural language 
processing must address and resolve. Some of these are listed here to demonstrate
the scope of this problem.

• Time flies like an arrow.
  Fruit flies like a banana.

• You wouldn't recognize Mary now. She’s grown another foot.

• Can anyone walk over Niagara Falls on a tightrope?

• A sandwich is better than nothing.
  Nothing is better than a good square meal.
  Therefore a sandwich is better than a good square meal. [Dobney 87]
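
The "I scream" and "ice cream" difficulty mentioned above can be reproduced with a small program that lists every way a string of phonemes can be divided into words. This is a sketch under stated assumptions: the four-entry lexicon and its ARPAbet-style phoneme spellings are invented for the example, and a real system would need contextual knowledge, which this code does not attempt, to choose among the results.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical lexicon: phoneme sequence -> words pronounced that way.
LEXICON: Dict[Tuple[str, ...], List[str]] = {
    ("AY",): ["I", "eye"],
    ("AY", "S"): ["ice"],
    ("S", "K", "R", "IY", "M"): ["scream"],
    ("K", "R", "IY", "M"): ["cream"],
}


def segmentations(phonemes: Sequence[str],
                  lexicon: Dict[Tuple[str, ...], List[str]] = LEXICON
                  ) -> List[List[str]]:
    """Return every word sequence whose phonemes spell out the input exactly."""
    if not phonemes:
        return [[]]                       # one way to segment nothing
    results: List[List[str]] = []
    for cut in range(1, len(phonemes) + 1):
        for word in lexicon.get(tuple(phonemes[:cut]), []):
            for rest in segmentations(phonemes[cut:], lexicon):
                results.append([word] + rest)
    return results


for words in segmentations(("AY", "S", "K", "R", "IY", "M")):
    print(" ".join(words))
```

The same sound sequence yields three candidate readings, and nothing in the signal itself indicates which one the speaker meant.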

The challenge will be to develop machines that will do what we mean, and not 
necessarily what we say. Literature documenting research dealing with natural 
language interfaces is found in Section 7 of Appendix C7.



III. RESULTS AND CONCLUSIONS 



A. RESULTS 

The primary objective of this thesis is to provide a single source of reference to 
enable the selection of an appropriate voice recognition system implementation for 
a given DSS or other computer application. Chapter II, Data Analysis, fulfills this 
objective by providing both a broad overview of voice recognition systems and 
their characteristics and a close-up view of specific categories within voice 
recognition. 

The second objective is to provide a reference guide to current voice 
recognition literature and research. Appendix C is such a guide. It contains an 
annotated bibliography and has subappendices that directly link this bibliography to 
specific areas of research that are discussed in Chapter II. An additional result of 
this study is Appendix D, a complete index of all publishers mentioned in the 
bibliography, which should facilitate retrieval of articles that might be difficult to 
locate. 

The third objective is to provide a current listing of all commercially available 
voice recognition systems. This listing is contained in Appendix E, and gives each 
manufacturer's name, address and phone number. The various types of voice input 
devices manufactured, their intended use, and their compatibility with current 
computer systems also are provided there. 

The overall goal of this study is to provide a useful guide to help in the decision 
making process concerning the implementation or the use of voice recognition 
systems. Information in this study can be used both as an introduction to voice
recognition systems and as a reference source to answer questions on specific
topics. The direct linking of specific topics to a grouping of articles dealing with 
this topic allows use of this study as a ready reference source. 

B. CONCLUSIONS 

As discussed in Chapter I, the dialog component of decision support systems 
may be the weak link when implementing a DSS. By using voice recognition 
systems to optimize this dialog component, the overall DSS will benefit. 

As noted in the Voice Recognition Systems Section of Chapter I, voice
recognition, as well as other fifth generation concepts, is expected to be critical for
the future of most computer applications.

Research listed in the Human Factors Section of Chapter II has shown that 
stress may result from a fear of new technology. Fear of new technology is not a 
recent phenomenon. This fear of voice recognition systems often is a result of the 
user not being previously introduced to such systems. Fear also can result when the 
user is unaware of what voice recognition can actually do (and cannot do). 

Considering the importance of voice recognition and its proven value to human
productivity, the volume of recent research is not keeping pace with the
perceived importance of the field. This is indicated by the amount of literature referenced
throughout Chapter II. The volume of publications has not increased in recent
years at the rate seen in earlier years.



IV. RECOMMENDATIONS



It is recommended that designers and users of DSSs investigate voice 
recognition systems as a means of optimizing the dialog component of DSSs. As 
noted by Ralph Sprague, describing the future of Decision Support Systems, 
"Dialog will profit significantly from the inclusion of natural language processing 
techniques and voice recognition" [Sprague 87]. 

As the reality of fifth generation computer technology approaches, the use of 
"intelligent systems" such as natural language processing and voice recognition 
systems will allow for both flexible and natural input. Although no one input 
method is perfect or even appropriate for all uses, voice systems show promise for
wider applications than presently are being implemented.

Widespread acceptance of computer voice recognition can be encouraged by 
proper training and orientation of potential users of such systems. A good training 
and education program in the use and benefits of voice recognition will help 
smooth the path for voice recognition implementation. 

More research is needed in all areas of voice recognition. Only through 
continued research and experimentation can voice recognition systems develop and 
improve. The perceived recent lull in voice recognition research may in part be 
due to normal delays in the publishing process or to recent cutbacks of research 
funds. However, since the demand for better input methods continues, research 
must also continue. 

It is hoped that this study can help guide and inspire the use of voice 
recognition systems for decision support systems and other computer 
implementations. A tool has been provided that can enable quick reference to
literature related to specific areas of concern and research within the domain of
computer voice recognition. Continued education and enlightenment should result 
in progress and greater acceptance of these systems.



APPENDIX A GLOSSARY OF TERMS



Group Decision Support System (GDSS): a computer-based system that aims at
supporting collective problem solving. A collective decision-making process can 
be viewed as a problem-solving situation in which there are two or more persons, 
(1) each of whom is characterized by his or her own perceptions, attitudes, 
motivations, and personality, (2) who recognize the existence of a common 
problem, and (3) who attempt to reach a collective decision. [Bui 86] 

Decision Support System (DSS): the application of available and suitable
computer-based technology to help improve the effectiveness of managed decision
making in semi-structured tasks. [Keen 78]

Voice Recognition (VR): the ability of a computer or device to recognize
spoken words correctly and translate those sounds into a predetermined output
string to a computer; also referred to as automatic speech recognition (ASR).
[LeFever 87]

Continuous Speech Recognition: the process of extracting information from
strings of words even though the words run together as in natural speech. [Yellen
83]

Discrete (Isolated) Speech Recognition: the process of transforming discrete
utterances (those with a significant pause between utterances) into
computer-recognized speech or text.

Utterance (Word): may be a single mono- or polysyllabic word (e.g., select) or
a combination of mono- or polysyllabic words joined into a phrase (e.g.,
select-the-first-choice).

Rejection: the inability of a recognizer to classify an utterance correctly.
[Yellen 83]

Misrecognition: classification by a recognizer of an utterance as something 
other than what was spoken. 

Speaker Dependent Systems: require adaptation (or "training") of the voice 
recognition system to the speech characteristics of each user in order to achieve 
recognition. 

Speaker Independent Systems: recognize speech regardless of the speaker, and 
without system training in recognition of individual speech characteristics of users.



to Listen?", Office Systems, v. 4, pp. 70+, April 1987.

Discusses voice recognition systems pointing out that the ultimate 
"talkwriter" is still unavailable but rapid technology enhancements to these 
products require that office-system planners begin to take the technology 
seriously and start determining possible applications in their organizations. 

[Banatre 83] Banatre, J. P., Frison, P., and Quinton, P., "Network for the
Detection of Words in Continuous Speech", Acta Informatica, pp. 431-448,
January 1983.

[Berman 84] Berman, J. V. F., "Speech Technology in a High Workload
Environment", Proceedings of the 1st International Conference of Speech
Technology, p. 69, October 1984.

Speech technology can provide a man-computer interface which is 
qualitatively different from conventional systems. While speech constitutes 
a natural form of interpersonal communication, difficulties may occur when 
speech is used for a different purpose, due to the limitations of human 
information processing capabilities. These capabilities will be discussed and 
laboratory experiments described which demonstrate some underlying 
principles by which aspects of the task structure must be constrained, 
especially in a high-workload environment. These considerations should 
help system designers to maximize the potential benefits offered by speech 
technology, and minimize its impact on such diverse factors as multiple task 
performance and the limitations of human working memory. 

[Betterton 83] Betterton, Andrew, "Voice Recognition Moves Out of the 
Labs," Computer Data, v. 8, p. 6, October 1983.

Explains some of the advances that have been made in the voice recognition
field but points out that the technology is still in its very early stages. 

[Bierfert 85] Bierfert, H., and Von Winfield, V., "Automatic Speech
Recognition: From Theory to Practice", Sprache und Information, p. 340,
1985. ISBN 34843191 19.

[Biermann 84] Biermann, Alan W., Gilbert, Kermit C., and Fineman, Linda 
S., "Introducing Vips: A Voice-Interactive Processing System for 
Document Management", National Computer Conference Proceedings, pp.
661-666, 1984. 

Describes a voice-interactive processing system that enables a user to display 
office-related data on a screen and manipulate it through a combination of 
voice and touch commands. 21 references. 

[Biermann 85-1] Biermann, A. W., Fineman, L., and Gilbert, K. C., "An 
Imperative Sentence Processor for Voice Interactive Office Applications", 
ACM Transactions on Office Information Systems, v. 3, pp. 321-346,
October 1985. 

An interactive sentence processor that enables a user to manipulate text with 
connected speech and touch-graphics input is described. The processor 
includes capabilities to follow dialogue focus, execute a variety of 
imperative commands, and handle nested noun groups, pronouns, and other 
phenomena. A micro model of the system, giving enough structure
to enable the reader to observe internal mechanisms in considerable detail, is
included. This processor is designed to be transported to a number of other
office automation domains such as calendar management, message-passing,
and desk calculation. Various examples and statistics related to its behavior 
in the text manipulation applications are given. The system has been 
implemented in PASCAL and can run on any machine that supports this 
language. 

[Biermann 85-2] Biermann, A., and others, "Natural Language With Discrete
Speech as a Mode for Human-to-Machine Communication", Communications of
the ACM, v. 28, n. 6, pp. 628-635, June 1985.

A voice interaction natural language system which allows users to solve 
problems with spoken English commands has been constructed. The system 
utilizes a commercially available discrete speech recognizer which requires
that each word be followed by a pause of approximately 300 milliseconds. In a
test of the system, subjects were able to learn its use after about two hours of
training. The system correctly processed about 77 percent of the over 6000
input sentences spoken in problem-solving sessions. Subjects spoke at the
rate of about three sentences per minute and were able to effectively use the 
system to complete the given tasks. Subjects found the system relatively easy 
to learn and use, and gave a generally positive report of their experience. 

[Bisiani 84] Bisiani, R., "Computer Systems for High-Performance Speech
Recognition", New Systems and Architectures for Automatic Speech
Recognition and Synthesis, pp. 169-190, Springer-Verlag New York, Inc.,
July 1984. ISBN 0-387-15177-X.

[Blunden 80] National Technical Information Service AD A-82/02, The
Impact of Speech Input and Recognition Systems on the Communications
Industry, by Brian Blunden, and others, p. 100, May 1980. Paper Industry
Research Association.

Investigates the impact of speech recognition on the communications 
industries and in particular its use as a speech input device to the printing 
industry. The study is based on interviews with senior executives and visits 
to research centers in various countries along with a literature survey. 

[Bridle 83] Bridle, J. S., Brown, M. D., and Chamberlain, R. M.,
"Continuous Connected Word Recognition Using Whole Word Templates",
Radio and Electronic Engineering, v. 53, pp. 167-175, April 1983.

Machines that recognize isolated words from a small, predefined vocabulary 
have been commercially available for many years. The whole word
pattern-matching principles used in these machines are described, and it is
shown how these principles can be extended to deal with continuously spoken
sequences of words. Details are given of the resulting connected word
recognition algorithm, which has already been implemented in real-time
hardware, which will be used to explore the full potential and limitations of
the method in many different applications. 

[Bridle 84] Bridle, J. S., "Challenges and Opportunities in Techniques for
Speech Pattern Processing", Proceedings of the 1st International Conference
of Speech Technology, p. 191, October 1984.

The types of speech knowledge needed for high-performance automatic 
speech recognition (ASR) and synthesis are outlined. The main lines of 
development of current speech recognition methods are sketched, 
emphasizing the 'stochastic model' approach. The possible role of speech 
synthesis as a basis for speech recognition is discussed. Further 
developments aimed at improving performance towards human listener 
levels are reported. A theme is the interaction between synthesis and 
recognition: the most promising automatic speech recognition methods can
be viewed as searches for the inputs to pattern synthesis systems that are 
most likely to generate the unknown speech patterns; current speech 
synthesis provides a useful basis for improved speech recognition models; 
and the latest idea in perception modelling is a parallel processing network 
that can behave as a recognizer or a synthesizer, depending on where the 
input is connected. 

[Bridle 87] Bridle, J. S., "Adaptive Networks and Speech Pattern
Processing", Pattern Recognition Theory and Applications, pp. 221-222,
June 1987. ISBN 0-387-17700-0.

[Bristow 86-1] Bristow, Geoff, "The Speech Recognition Problem", pp. 3-17,
in: Electronic Speech Recognition, McGraw-Hill Inc., 1986. ISBN
0-07-007913-7.

[Bristow 86-2] Bristow, G., Electronic Speech Recognition: Techniques, 
Technology, and Applications, McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Bronson 85] Bronson, E., and Jamieson, L. H., "A Distributed Parallel
Architecture for Speech Understanding", Algorithmically Specialized
Parallel Computers, pp. 139-148, Academic Press, Inc., 1985. ISBN
0-12-654130-2.

[Brown 87] Brown, Evelyn, "Voice Recognition: A Promising New
Technology", Industrial Engineering, v. 18, pp. 40-41, September 1986.

Looks at voice recognition highlighting the unlimited potential it holds for 
saving time and money as well as bolstering productivity as a technology that 
can complement bar coding and other forms of automatic identification. 

[Bruce 82] Bruce, B. C., Natural Communications Between Person and
Computer, pp. 55-88, Lawrence Erlbaum Associates, 1982. ISBN 0-89859-191-0.



[Calcaterra 82] Calcaterra, F., Application of Artificial Intelligence in Voice 
Recognition Systems in Micro Computers, Master's Thesis, Naval 
Postgraduate School, Monterey, California, March 1982. AD A115735.

This research investigates the use of inexpensive voice recognition systems 
hosted by microcomputers. The specific intent was to demonstrate a 
measurable and statistically significant improvement in the performance of 
relatively unsophisticated voice recognizers through the application of 
artificial intelligence algorithms to the recognition software. Two
different artificial intelligence algorithms were studied, each with a different
level of sophistication. Results showed that artificial intelligence can
increase recognizer system reliability. The degree of improvement in
correct recognition percentage varied with the amount of sophistication in
the artificial intelligence algorithms.

[Cashen 86] Cashen, F., "Speech I/O Products Offer Board-Level
Solutions", Computer Design, pp. 36-40, 15 March 1986.

Lower IC memory prices, more powerful digital processors, better
algorithms, and a proliferation of personal computers are among the factors
helping board-level speech I/O products come of age. Voice synthesis and
recognition boards that plug into the expansion slots are now available for 
just a few hundred dollars. 

[Cater 84] Cater, J. P., Electronically Hearing: Computer Speech
Recognition, pp. 263+, Howard W. Sams & Co., Inc., 1984. ISBN
0-672-22173-X.

This book covers the subject of computer speech recognition in eleven
chapters.

[Cavazza 84] Cavazza, M., Ciaramella, A., and Pacifici, R.,
"Implementation of an Acoustical Front-End for Speech Recognition", New
Systems and Architectures for Automatic Speech Recognition and Synthesis,
pp. 215-223, Springer-Verlag New York, Inc., July 1984. ISBN
0-387-15177-X.

[Cerf-Danon 87] Cerf-Danon, H., and others, "Speech Recognition Experiment 
with 10,000 Word Dictionary", Pattern Recognition Theory and 
Applications, pp. 203-209, June 1987. ISBN 0-387-17700-0. 

[Clements 87] Clements, Mark A., "Voice Recognition Systems Can Be 
Designed to Serve a Variety of Purposes", Industrial Engineering, v. 19, pp. 
44+, September 1987. 

Examines the technology of voice recognition, discusses the state of the art at 
present, criteria for choosing a system, and the tradeoffs that are necessary 
for this purpose. 

[Cochran 83] Cochran, D. J., and Riley, M. W., "Data Input By Voice",
Computers and Industrial Engineering, pp. 115-120, 1983.

[Cole 85] National Technical Information Service PB87-214680/WCC,
Research on Feature-Based Systems for Speech Recognition, by R. A. Cole,
p. 17, July 1985.

The goals of the research were to (1) develop a system on a research 
computer to perform speaker-independent recognition of connected digits, 
(2) analyze the algorithms to determine the processing and memory 
requirements of the system, and (3) determine the feasibility of building a 
hardware device to run the algorithms in real-time. All goals were either 
met or exceeded with the development of a powerful new technology for 
computer speech recognition. This technology is called feature-based 
recognition because the perceptually important features of the speech signal 
are used to make decisions about what was said. 

Potential applications include voice telephone dialing, voice data entry, and 
voice control of devices and processes. 

[Connolly 86] Connolly, J., and others, "Automatic Speech Recognition 
Based on Spectrogram Reading", International Journal of Man-Machine
Studies, v. 24, pp. 611-621, June 1986.

An approach to the problem of automatic speech recognition based on 
spectrogram reading is described. Firstly, the process of spectrogram 
reading by humans is discussed, and experimental findings presented which 
confirm that it is possible to learn to carry out such a process with some 
success. Secondly, a knowledge-engineering approach to the automation of 
the linguistic transcription of spectrograms is described and some results are 
presented. It is concluded that the approach described here offers the 
promise of progress towards the automatic recognition of multi-speaker 
continuous speech. 

[Conrad 83] Conrad, Ann E., "Voice Systems Are A 'Sound' Investment",
Data Management, v. 21, pp. 20-23, November 1983.

Cautions information processing managers to keep in mind several 
important evaluation criteria when implementing voice processing 
technology. 

[Cook 85] Cook, James, "Data Entry Via Voice Recognition",
Manufacturing Systems, pp. 28+, 1985.

Discusses Speaker Dependent Recognition (SDR) technology now used in
factory data collection applications.

[Dabbagh 86] Dabbagh, H., Damper, R., and Guy, D., "Transparent
Interfacing of Speech Recognizers to Microcomputers", Microcomputers &
Microsystems, pp. 371-376, September 1986.

[Damper 84] Damper, R. I., "Speech Technology and the Disabled",
Proceedings of the 1st International Conference of Speech Technology, p.
135, October 1984.

Disabled people are likely to be among the earliest users of emergent speech 
technology. New capabilities of speech synthesis and recognition offer 
much promise in assisting disabled members of society to lead fuller lives, 
whether their handicap be sensory or physical. Speech synthesis can give the 
non-vocal a voice and make printed and "electronic" information accessible 
to the blind. Speech recognition devices, although having only rudimentary 
capability, are starting to make voice control of machines a practical 
proposition for people with limited physical ability. The deaf can also look 
forward to improved speech-reading ("lip-reading") aids based on new 
speech analysis hardware and software. However, there are many 
limitations to this technology and our understanding of how best to apply it. 
These difficulties are likely to severely curtail the success of attempts to 
harness speech technology to serve the disabled for some time to come. 

[Damper 85] Damper, R., "Voice-Input Aids for the Physically Disabled",
International Journal of Man-Machine Studies, pp. 541-553, 1985.

[De Mori 84] De Mori, R., and LaFace, P., "On the Use of Phonetic
Knowledge for Automatic Speech Recognition", New Systems and
Architectures for Automatic Speech Recognition and Synthesis, pp. 569-591,
Springer-Verlag New York, Inc., 2-14 July 1984. ISBN 0-387-15177-X.

[De Mori 85-1] De Mori, R., LaFace, P., and Mong, Y., "Parallel Algorithms 
for Syllable Recognition in Continuous Speech", IEEE Transactions on 
Pattern Analysis and Machine Intelligence, pp. 56-69, January 1985. 

[De Mori 85-2] De Mori, R., "Algorithms and Architectures for Speech 
Understanding", Algorithmically Specialized Parallel Computers, pp. 149-158,
Academic Press, Inc., 1985. ISBN 0-12-654130-2.

[De Mori 85-3] De Mori, R., "Parallel Algorithms for Hypothesis Generation 
in Continuous Speech", Computer Architectures for Spatially Distributed 
Data, pp. 375-391, Springer-Verlag New York, Inc., 1985. ISBN
0-387-12886-7.

[De Mori 87-1] De Mori, R., Lam, L., and Probst, D., Rule Based Detection of
Speech Feature for Automatic Speech Recognition, pp. 155-179, Cambridge
University Press, 1987. ISBN 0-521-30983-2.

[De Mori 87-2] De Mori, R., "Knowledge-Based Computer Recognition of 
Speech", Pattern Recognition Theory and Applications, pp. 433-450, 9-20 
June 1987. ISBN 0-387-17700-0. 

[Dillman 84] Dillman, R., and others, "Reduction of Complexity in Speech
Recognition", Proceedings of the 1st International Conference of Speech
Technology, p. 77, October 1984.

The development of speech recognition and speech synthesis substantially
improves the quality of man-robot interaction. For an inexperienced user it
is easier to become familiar with speech communication than with typed
"robot languages", which are sometimes hard to understand. Both the
programming of robots and the verification of their actions (e.g., testing
of robot programs) can be supported by acoustical interfaces. This paper
presents a speech recognition and speech synthesis system that has a high
recognition rate, an extendable vocabulary, a sentence generator, and an
interface to robot controls. A finite state automata model is used to
reduce the search space and time for speech recognition.

[Di Martino 84] Di Martino, J., "Dynamic Time Warping Algorithms for
Isolated and Connected Word Recognition", New Systems and Architectures
for Automatic Speech Recognition and Synthesis, pp. 405-418,
Springer-Verlag New York, Inc., 2-14 July 1984. ISBN 0-387-15177-X.

[EDP Anal 83] "Is 'Voice' In Your Future Systems?", EDP Analyzer, v. 21,
pp. 1-12, August 1983.

Discusses recent developments in the voice field such as the "processing" of 
voice messages and the appearance of new voice products making voice 
worth investigating for future information systems. Looks at four areas
involved in voice processing, including voice synthesis, voice recognition,
and voice mail.

[Elenius 86] Elenius, K., and Blomberg, M., Voice Input for Personal
Computers, pp. 361-372, McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Elster 80] Defense Technical Information Center, AD A106138, The
Effects of Certain Background Noise on the Performance of a Voice
Recognition System, by R. Elster, September 1980. NPS-55-80-010.

[Epstein 86] Epstein, Jonathan, "Voice Recognition: Six Users Pioneer
Cost-Saving Applications", Computer World, v. 20, pp. 79-82, 16 June
1986.

Points out that innovative users who design voice applications for their 
personal productivity find they have an exciting and profitable tool. 

[Eskenazi 83] Eskenazi, M., and Lienard, J. S., "Recognition of Steady-State 
French Sounds Pronounced by Several Speakers: Comparison of Human 
Performance and an Automatic Recognition Algorithm", Speech 
Communication, v. 2, n. 2-3, pp. 173-177, July 1983. 

[Fallside 85] Fallside, F., and Woods, W., Computer Speech Processing, p.
506, Prentice/Hall International, 1985. ISBN 0-13-163841-6.

[Fallside 86] Fallside, F., Harrison, T., and Prager, R., "Boltzmann
Machines for Speech Recognition", Computer Speech and Language, pp. 3-27,
March 1986.

[Fisher 86] Fisher, M., Voice Control for the Disabled, pp. 309-321,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Ford 83] National Technical Information Service 84003837, How to
Talk to Your Computer, Literally, by W. Ford, p. 8, 1983. Department of
Energy.

Provides guidelines for selecting and using voice I/O hardware including 
vocabulary size, method of training, upload/download capabilities, user 
control of recognition parameters, package form factor, and information 
returned to the user. 

[Foster 82] Foster, Richard A., "A Word About Its Future", Computer
World Extra, v. 16, pp. 39-40, 17 March 1982.

Discusses technological developments in voice recognition and response and 
explains how to determine whether a company has a need for it. 

[French 83] French, B. A., Some Effects of Stress on Users of a Voice
Recognition System: A Preliminary Inquiry, Master's Thesis, Naval
Postgraduate School, Monterey, California, March 1983. AD A128559.

This thesis is an attempt to see if placing users of voice recognition
equipment under time-induced stress has an effect on their percent correct
recognition rates.

[Friedman 84] Friedman, Elliot, "Voice Technology Coming for
Microcomputers in 1984", Computer World, v. 18, p. 64, 30 April 1984.

Reports that, in 1984, voice applications for microcomputers will be
broadly available for the first time, able to be integrated with other 
applications. 

[Frison 84-1] Frison, P., and Quinton, P., "Systolic Architectures for
Connected Speech Recognition", New Systems and Architectures for
Automatic Speech Recognition and Synthesis, pp. 145-167, Springer-Verlag
New York, Inc., 2-14 July 1984. ISBN 0-387-15177-X.

[Frison 84-2] Frison, P., "An Integrated Systolic Machine for Speech
Recognition", VLSI Algorithms and Architectures, pp. 175-188, Elsevier
Science Publishers Co., Inc., 1984. ISBN 0-444-87662-6.

[Good 84] Good, Robert A., "Voice Input/Output at Less Cost", Systems
& Software, v. 3, pp. 141-144, May 1984.

Describes a new software package that speeds the development of voice I/O 
for the IBM PC and multichannel systems. 

[Gould 83] Gould, J. D., and Boies, S. J., "Human Factors Challenges in
Creating a Principal Support Office System-The Speech Filing System
Approach", ACM Transactions on Office Information Systems, v. 1, pp.
273-298, October 1983.

This paper identifies the key behavioral challenges in designing principal- 
support office systems and our approaches to them. These challenges 
included designing a system which office principals would find useful and 
would directly use themselves. Ultimately, the system, called the Speech 
Filing System (SFS), became primarily a voice store and forward message
system with which principals compose, edit, send, and receive audio messages, using
telephones as terminals. Our approaches included behavioral analyses of 
principals’ needs and irritations, controlled laboratory experiments, several 
years of training, observing, and interviewing hundreds of actual SFS users, 
several years of demonstrating SFS to thousands of potential users and 
receiving feedback, empirical studies of alternative methods of training and 
documentation, continual major modifications of the user interface, 
simulations of alternative user interfaces, and actual SFS usage analyses. The
results indicate that SFS is now relatively easy to learn, solves real business
problems, and leads to user satisfaction.

[GovDatSys 86] "Computers Have Learned To Listen", Government Data
Systems, v. 15, pp. 86+, July/August 1986.

Tells how breakthroughs in speech recognition have spurred a search for
VARs and integrators to develop applications.

[Green 83] Green, T. R. G., and others, "Friendly Interfacing to Simple
Speech Recognizers", Behavior and Information Technology, pp. 23-38,
January-March 1983.

[Green 85] Green, Phil, "Speech Recognition-What is Happening Now?",
Computer Bulletin, v. 1, pp. 5-7, September 1985.

Points out that there is a widespread feeling in the speech research 
community that automatic speech recognition is rapidly coming of age. 

[Gubrynowicz 84] Gubrynowicz, R., Le Guennec, L., and Mercier, G., "Detection
and Recognition of Nasal Consonants in Continuous Speech-Preliminary
Results", New Systems and Architectures for Automatic Speech Recognition
and Synthesis, pp. 613-628, Springer-Verlag New York, Inc., July 1984.
ISBN 0-387-15177-X.

[Haas 84] Haas, M., "The Texas Speech Command System", Byte, pp.
341-348, June 1984.

You can now give voice commands to the TI Professional Computer or use 
it as an answering machine and a smart telephone. 

[Hager 86] Hager, Peter, "Breakthroughs Said to be Ahead for Voice
Recognition", Government Computer News, v. 5, p. 40, 29 August 1986.

Points out that voice recognition is not only the most natural way to
communicate with a computer as a means of inputting data, but also offers
users marked productivity gains over conventional keyboard input.

[Harrison 84] Harrison, J. A., "Evaluation, Assessment and Selection of
Speech Products for Use in Applications", Proceedings of the 1st
International Conference of Speech Technology, pp. 49-56, October 1984.

This paper offers pragmatic guidance to readers who are not speech
specialists on the evaluation and assessment of the relative merits of
speech products against an identified need. It covers the formulation of
detailed requirements, the difficulties of specifying performance in simple
terms, potential methods of evaluation, and some pitfalls to avoid. It
concludes with the opinion that the spread in the use of speech technology
depends largely on non-specialists learning to apply what is available.

[Haton 85] Haton, J. P., "Artificial Intelligence for Automatic Speech
Understanding", Technology and the Science of Informatics, pp. 265-287,
May/June 1985.

[Haton 87] Haton, J., Fundamentals in Computer Understanding: Speech
and Vision, p. 276, Cambridge University Press, 1987. ISBN 0-521-30983-2.

[Henkle 83] Henkle, Tom, "Fewer Firms Developing Voice Systems:
IRD", Computer World, v. 17, pp. 10+, 17 January 1983.

Discusses why conversing computers are still not a reality.

[Hill 86] Hill, E. T., and Kotowski, L. B., Using Voice Recognition as
an Input Medium to the JINTACCS Automated Message Preparation System
(JAMPS), Master's Thesis, Naval Postgraduate School, Monterey,
California, March 1986.

This thesis investigates the interfacing of voice recognition, also known as 
automatic speech recognition (ASR), with the Joint Interoperability of 
Tactical Command and Control Systems (JINTACCS) Automated Message 
Preparation System (JAMPS). The voice recognition system we used is the
Texas Instruments (TI) TI-SPEECH (tm) embedded in the Texas
Instruments Portable Professional Computer (PPC). We were able to load
the Joint Automated Message Preparation System software onto the Texas
Instruments Portable Professional Computer hard disk. With the
vocabulary we built, we ran the Joint Automated Message Preparation
System software on the Texas Instruments Portable Professional Computer 
using voice recognition. Our results indicate Automatic Speech Recognition 
has an application in message preparation during military operations. 
Automatic Speech Recognition could curtail the time to prepare messages, 
and thereby reduce the time element in the command and control process. 
We propose a measure of performance to test how much time might be saved 
by using Automatic Speech Recognition with Joint Automated Message 
Preparation System. We also suggest some areas for future research. 

[Hobbs 84] Hobbs, G. R., "The Application of Speech Input/Output to
Training Simulations", Proceedings of the 1st International Conference of
Speech Technology, p. 121, October 1984.

In some command and control training situations the trainee is being
instructed in a task which involves the use of a well defined and well 
structured command language to communicate with other people over a 
voice communications link. These training situations frequently require the 
use of additional experienced personnel to act as "stand-ins" at the end of the 
simulated link. 

This paper describes the selection of a suitable application, the construction 
of and early experience with an experimental Air Traffic Control Trainer 
system, the first phase of which was completed early in the second quarter of 
1984. The system uses speech recognition and synthesis under the control of 
a computer to simulate the action of the "stand-in", normally known as the 
blip driver in this type of system. The computer is also used to sequence the 
training scenario. 

[Howell 83] Howell, P., "The Extent of Coarticulatory Effects:
Implications for Models of Speech Recognition", Speech Communication,
v. 2, n. 2-3, pp. 159-163, July 1983.

[Hunt 83] Hunt, M., Lenning, M., and Mermelstein, P., Use of Dynamic
Programming in a Syllable-Based Continuous Speech Recognition System,
pp. 163-188, Addison-Wesley Publishing Co., 1983. ISBN 0-201-07809-0.

[Hunter 85] Hunter, Phillip, "Speak and You Shall be Answered", IBM
User, pp. 43+, November 1985.

Looks at voice recognition systems which are finally moving ahead due to 
recent breakthroughs in technology. 

[Int Res Dev 80] International Resource Development, Inc., Speech 
Recognition and Computer Voice Synthesis, p. 177, 1980. 

Explores present and future applications of speech recognition and synthesis 
including commercial applications. Current and potential suppliers are 
reviewed, with detailed information on shipment levels, market shares, and 
strategies included. Companies discussed include Threshold Technology, 
Perception Technology, IBM, and Verbex. Applications of large system 
integration technology for advanced speech recognition and synthesis are 
discussed. 

[Int Res Dev 85] International Resource Development, Inc. Report 644, Speech
Recognition & Voice Synthesis, 1985.

[Int Res Dev 87] International Resource Development, Inc. Report 702,
Corporate Talkwriters Voice Mail & Speech Processing in Office
Automation, 1987.

[Ivall 86-1] Ivall, T., Commercial Speech Recognizers, pp. 216-233,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Ivall 86-2] Ivall, T., Linking Recognizers to Computers, pp. 234-243,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Jinper 85] Jinper, X., Feng, Y., and Zhiqiang, T., "A Speech Recognition
Interface to a Microcomputer", Chinese Journal of Computers, pp. 213-222,
1985.

[Johnson 85] Johnson, S. R., Connolly, J. H., and Edmonds, E. A.,
"Spectrogram Analysis: A Knowledge-Based Approach to Automatic
Speech Recognition", Research and Development in Expert Systems, pp.
95-103, Cambridge University Press, 1985. ISBN 0-89797-149-0.

[Johnson 86] Johnson, P., Long, J., and Visick, D., "Voice Versus
Keyboard: Use of a Comparative Analysis of Learning to Identify Skill
Requirements of Input Devices", People and Computers: Designing for
Usability, pp. 546-562, Cambridge University Press, September 1986.
ISBN 0-521-33259-1.

[Joost 83] Joost, M. G., Hosni, Y. A., and Petry, F. E., "Voice
Communication With Computers: A Primer", Computers and Industrial
Engineering, pp. 101-114, 1983.

[Keller 85] Keller, Erik L., "Voice Recognition Starts Sounding Off",
Systems & Software, v. 4, pp. 55+, March 1985.

Reports that voice recognition is being used increasingly in manufacturing 
applications due to price reduction, PC use, and interest by original- 
equipment manufacturers. 

[Koelsch 87] Koelsch, James R., "Talk to Your Computer, It Understands",
Product Engineering, pp. 44+, April 1987.

Reports that manufacturers often use computers to prompt machine 
operators, and to verify that workers enter data correctly. 

[Kohonen 84] Kohonen, T., and others, "On-Line Recognition of Spoken
Words from a Large Vocabulary", Information Sciences: An International
Journal, v. 33, pp. 3-30, 1984.

It is demonstrated in this paper that a real-time, large-vocabulary,
isolated-word speech recognition system can effectively be implemented using the
word speech recognition system can effectively be implemented using the 
following two-stage organization: (1) conversion of the speech signal into 
phonemic transcriptions, (2) recognition of phonemic transcriptions by 
advanced searching methods. A comparison of several alternatives for the 
first stage has indicated that the best accuracy is achieved by the
learning-subspace method.

For the second stage the authors recommend fast string searching by 
redundant hash addressing combined with subsequent probabilistic analysis. 
The above system has been implemented in a minicomputer environment. 

[Korzeniowski 86] Korzeniowski, Paul, "First User not Mum", Network World,
v. 3, pp. 13-14, 24 March 1986.

Looks at AT&T’s pilot project Conversant 1 Voice System, a speech 
recognition system that allows users to input data via spoken words which 
the system translates into data. 

[Kurzweil 86] Kurzweil, R., "The Technology of the Kurzweil Voice 
Writer", Byte, pp. 177-186, March 1986.

The present office system provides a clue to future applications for the deaf. 

[Kuzela 86] Kuzela, Lad, "Voice Technology: Now They’re Listening",
Industry Week, v. 229, pp. 35-37, 28 April 1986.

Shows that after years of disappointing responses in courting potential users, 
vendors of computerized voice systems are now making headway. 

[Lea 86] Lea, W., "The Elements of Speech Recognition", pp. 49-129,
in: Electronic Speech Recognition, McGraw-Hill Inc., 1986. ISBN
0-07-007913-7.

[LeFever 87] LeFever, Michael A., Speech Recognition in a Command and
Control Workstation Environment, Master’s Thesis, Naval Postgraduate
School, Monterey, California, March 1987.

This thesis investigates speech recognition in a command and control 
workstation environment. It discusses the Navy's need for a command and 
control workstation (CCWS) and the importance of the human interface 
design.

[Leggett 82] Leggett, John Joseph, "An Empirical Investigation of Voice as
an Input Modality for Computer Programming", Computer Science, v. 13, p.
366, 1982. UMI order number: DA 83-06794.

This dissertation discusses the design, implementation, and results of a 
controlled experiment to evaluate voice versus keyboard (the standard input 
mode) in a language-directed editing environment. Twenty-four subjects 
input and edited program segments under control of a language-directed 
editor via the two input modes. Measures of speed, accuracy, and efficiency 
were used to compare the two modes of input. 

[Levinson 86] Levinson, S., "Continuously Variable Duration Hidden 
Markov Models for Automatic Speech Recognition", Computer Speech and 
Language, pp. 29-45, March 1986.

[Llaurado 82] Llaurado, J. G., "Computerized Speech Recognition",
International Journal Bio-Medical Computing, pp. 91-94, March 1982. 

[Lombardo 84] Lombardo, J. P., Using Continuous Voice Recognition
Technology as Input Medium to the Naval Warfare Interactive Simulation
System (NWISS), Master’s Thesis, Naval Postgraduate School, Monterey,
California, April 1984. AD A7916.

A great deal of research has been conducted in the past 20 years concerning 
the use of voice recognition equipment with computers. The goal of this 
research has been to improve the man-machine interface. With the 
breakthrough from discrete to continuous voice recognition technology in 
the 1970s, a large step toward that goal was taken. 

This thesis attempts to show that continuous voice recognition technology 
can be effectively applied in a highly interactive, computer-aided 
wargaming environment. Through analysis of the strictly-formatted 
command syntax of the Naval Warfare Interactive Simulation System 
(NWISS) and use of commercially available, innovative, continuous speech 
hardware and software, a new input medium was created for the user of that 
wargame. The true effectiveness of this application of voice recognition 
technology must still be tested. Plans for such testing are being made and, to 
that extent, the thesis objectives are partly met. 

[Longuet-Higgins 85] Longuet-Higgins, C., Tones of Voice: The Role of 
Information in Computer Speech Understanding, pp. 293-304, Prentice/Hall 
International, 1985. ISBN 0-13-163841-6. 



[Lundquist 82] Lundquist, Eric, "Voice-Input Systems Make Inroads into
Industrial Applications-Manufacturing", Mini Systems, pp. 165+, October
1982.

Suggests that voice-input systems are becoming more attractive; they are 
more reliable, cost less and have overcome hurdles of past efforts. 

[Mackie 87] Mackie, K., Katsch, R., and Dermody, P., "Assessment of
Evaluation Measures for Processed Speech", Speech Communication, v. 6,
n. 4, pp. 309-316, 1 December 1987.

The present study uses a range of speech intelligibility measures to examine 
their effectiveness in the evaluation of highly intelligible processed speech. 
The results show that speech stimuli which are not differentiated by 
traditional intelligibility measures can be differentiated by more sensitive 
test methodologies. The results indicate the value of including more 
sensitive tests of speech intelligibility in evaluation protocols for processed 
speech. 

[Madron 84] Madron, Thomas, "Speech Systems Gaining Ground",
Computer World, v. 18, pp. 73-74, 6 February 1984.

Focuses on speech systems, and predicts that they are likely to become the 
major new I/O device of microcomputing in the middle to late 1980s. 

[Maenobu 84] Maenobu, K., "Speaker-Independent Word Recognition in
Connected Speech on the Basis of Phoneme Recognition", Automatic Speech 
Recognition, pp. 31-62, July/August 1984. 

[Mariani 83] Mariani, J., and others, "A Man-Machine Speech
Communication System Including Word Based Recognition and
Text-to-Speech Synthesis", Proceedings of the IFIP World Computer
Congress, pp. 673-679, 1983.

Presents a man-machine speech communication system which is composed 
of a speech recognition module and a speech synthesis module, each 
implemented on a single board and using microprocessors. 18 references.

[Martin 84] Martin, B. J., and Poock, G. K., "An Initial Applied Look at
Stress and Voice Recognition", Journal of the American Voice Input/Output
Society, v. 1, pp. 24-33, June 1984.

[Martin 86] Martin, S., "Difficult Speech-Recognition Technology Shows
Signs of Maturity", Computer Design, pp. 23-29, 1 August 1986.

Speech synthesis/recognition is acknowledged as a powerful and natural
human interface to a computer, and the economic fuel that funded past 
research is now being applied to product development as well. 

[Mascarenas 84] Mascarenas, John, "Voice Processing Creates a New 
Dimension in Speech: The Ability to Talk Without a Tongue", Computer 
World On Communications, v. 1, pp. 50-52, November 1984. 

Discusses the new technology that implements speech communications with 
computers and spans such applications as voice synthesis, voice recognition, 
and voice and text processing. 

[Mavaddat 85] Mavaddat, F., and Cheng, S. K. S., "Word Recognition in a
Reduced Linear Prediction Space", Pattern Recognition Letters, pp.
185-190, May 1985.

[McCracken 81] McCracken, Donald L., A Production System Version of 
Hearsay-II Speech Understanding System, p. 139, UMI Research Press, 
1981. 

Describes a detailed comparison of a reimplementation of the speech 
understanding system, HEARSAY-II, with its predecessor. 

[Meade 85] Meade, Jim, "Winning Small in Voice Recognition",
Hardcopy, v. 14, pp. 20+, November 1985.

Reports that by taking a limited vocabulary approach to voice recognition, 
DEC has engineered a viable adjunct to DECtalk. 

[Meisel 84] Meisel, W. S., "Speech-to-Text-Systems—The User's Needs",
Proceedings of the 1st International Conference of Speech Technology,
p. 161, October 1984.

A large vocabulary continuous speech recognition system which transcribes 
speech to computer-readable text is an attractive objective. It would allow a 
user to get his ideas into a computer without typing. In a practical product,
limitations on vocabulary, accuracy, and the user's freedom to speak 
naturally diverge from the ideal. This article discusses acceptance of a 
speech-to-text product, and the probable time frame in which initial 
products will be available. 

[Meisel 86] Meisel, W., Towards The "Talkwriter", pp. 338-348,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Meloni 83] Meloni, H., and Guizol, J., "Identifying Pseudo-Phonetic
Events for Automatic Word Recognition", (FRENCH), Speech
Communication, v. 2, n. 2-3, pp. 211-214, July 1983.

This paper describes the pseudo-phonetic decoding level of a speech 
recognition system. The signal representation is obtained by means of 
spectral and temporal parameters. Automatic segmentation and labeling 
algorithms produce a sequence of pseudo-phonetic classes which 
characterize the steady and transient parts of speech sounds. These
segments are defined in terms of pseudo-phonetic features. Prosodic
information is carried by labels assigned to vocalic events.

[Meloni 87] Meloni, H., Gispert, J., and Guizol, J., "An Expert System
for Analytic Word Identification in Continuous Speech", Expert Systems &
Their Applications, 5th International Workshop, v. 2, pp. 1239-1250,
Agence de l'Informatique, 13-15 May 1987. ISBN 2-86581-0283-X.

[Menke 87] Menke, Susan M., "Voice Recognition Applications Will
Increase in 1987", Government Computer News, v. 6, pp. 44-45, 16 January
1987.

Suggests that by the turn of the century speaker-independent continuous 
voice recognition software is expected to contain a large enough vocabulary 
for general office use; in the meantime, voice systems, with all their 
problems, are being used in a variety of applications now. 

[Minault 87] Minault, S., and others, "An Expert System for Speech
Recognition by Signal Segmentation", Expert Systems & Their
Applications: 5th International Workshop, v. 2, pp. 1251-1266, Agence de
l'Informatique, 13-15 May 1987. ISBN 2-86581-0283-X.

[Mod Mat 83] "Voice Recognition—Back Again, and Better", Modern
Materials Handling, v. 38, pp. 52-53, 6 April 1983.

Contends that in the near future voice recognition will be at the heart of 
office automation and communication networks. 

[Mokhoff 84] Mokhoff, N., "Voice I/O Adds New Dimension to Computer 
Interface", Computer Design, pp. 19-21, March 1984. 

[Moody 85] Moody, H. Gerald, "Voice Recognition: At the Threshold",
Information Strategy, v. 1, pp. 40-42, Summer 1985.

Reports that while progress in voice recognition technology has been
limited, the flurry of new product offerings in recent months may be a 
harbinger of faster progress in the future. 

[Moore 84-1] Moore, R., "Overview of Speech Input", Proceedings of the
1st International Conference of Speech Technology, p. 25, October 1984.

This paper is intended to provide a brief insight into some of the techniques 
that underlie contemporary automatic speech recognition systems. It is 
shown how the concept of 'whole-word pattern matching' has established 
itself as an important principle, and a range of such algorithms is discussed. 
It is also shown how techniques for isolated word recognition may be 
extended to the recognition of connected speech. It is concluded that, 
although current automatic speech recognition algorithms are still relatively 
unsophisticated, they nevertheless exhibit a level of performance which can 
be useful in a wide range of well constrained task environments. 

[Moore 84-2] Moore, R., "Systems for Isolated and Connected Recognition", 
New Systems and Architectures for Automatic Speech Recognition and 
Synthesis, pp. 73-143, Springer-Verlag New York, Inc., 2-14 July 1984. 
ISBN 0-387-15177-X.

[Murveit 83] Murveit, Hyman Jack, "An Integrated Circuit Based Speech
Recognition System", Electronics and Electrical Engineering, v. 15, p. 96,
1983, UMI order number: AD A84-13527.

A high performance, flexible, and potentially inexpensive speech 
recognition system is described in this report. The system is based on two 
special-purpose integrated circuits that perform the speech recognition 
algorithms very efficiently. One of these integrated circuits is the front-end 
processor. It computes spectral coefficients from incoming speech, 
normalizes these spectra and finds the start and end of words in the speech. 
It transmits these spectra to a second integrated circuit that compares them
with spectra from a set of stored word templates. The system can compare 
an input word with one thousand word templates and respond to a user 
within one quarter of a second. The system normally responds to words 
spoken in isolation from a particular speaker; however, it can be used with
connected speech as well as in a speaker independent manner. Modifying 
speech recognition algorithms to work with specially designed integrated 
circuits is shown to permit even high performance algorithms to be 
performed inexpensively. Using techniques such as these, speech recognition
devices should have a large range of applications within the next few years.

[Myers 83] Myers, Edith, "If We Could Talk to the Terminals...",
Datamation, v. 29, pp. 181+, October 1983.

Presents an overview of the many companies that are working towards voice 
recognition as a necessary part of office automation. 

[Nakagawa 84] Nakagawa, S. I., "Connected Spoken Word Recognition
Algorithms by Constant Time Delay DP, O(n) DP and Augmented
Continuous DP Matching", Information Sciences: An International Journal,
pp. 63-86, July/August 1984.

[Neil 81] National Technical Information Service AD A103280,
NPS-55-81-003, Examination of Voice Recognition System to Function in a
Bilingual Mode, by D. E. Neil and T. Andreason, February 1981.

[Niemann 84] Niemann, H., and others, "The Speech Understanding and
Dialog System EVAR", New Systems and Architectures for Automatic
Speech Recognition and Synthesis, pp. 271-302, Springer-Verlag New
York, Inc., 2-14 July 1984. ISBN 0-387-15177-X.

[Niemann 85] Niemann, H., and others, "A System for Understanding 
Continuous German Speech", Information Sciences: An International 
Journal, pp. 87-113, 1985. 

[Nishida 86] Nishida, S., "Speech Recognition Enhancement by Lip
Information", SIGCHI Bulletin, pp. 198-204, April 1986.

[Nocerino 85] Nocerino, N., and others, "Comparative Study for Several 
Distortion Measures of Speech Recognition", Speech Communication, v. 4, 
n. 4, pp. 317-331, December 1985. 

Local spectral distortion measures are commonly used to measure the 
similarity (or spectral distance) between two given short-time spectra. In 
this study we compared several different distortion measures including the 
Itakura-Saito (IS) distortion measure, the log likelihood ratio (LLR) 
distortion measure, the likelihood ratio (LR) distortion measure, the 
cepstral (CEP) distortion measure, and two proposed perceptually based 
distortion measures, the weighted Likelihood Ratio (WLR) and the weighted 
slope metric (WSM) distortion measures, in terms of their effects on the 
performance of a standard dynamic time warping (DTW) based, isolated 
word, speech recognizer. Two modifications of the basic forms of each 
measure were also investigated, namely, a Bark-scale frequency warping 
and the incorporation of suprasegmental energy information. All distortion 
measures and their modifications were tested on an alpha-digit vocabulary,
4-talker, telephone recording data base. The results can be summarized as:
(1) All LPC-based distortion measures performed reasonably well. The
LLR and WSM distortion measures gave the highest recognition accuracy, 
while the IS distortion measure gave the lowest score; (2) Whereas the 
addition of suprasegmental energy information helped the recognition 
performance, the use of gain and absolute loudness degraded the 
performance; (3) Bark-scale frequency warping did not, at least for the 
highly bandlimited telephone data base we tested, perform as well as its 
unwarped counterpart; (4) The WLR distortion measure did not perform as 
well as its unweighted counterpart. 
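
As a rough illustration of the kind of recognizer evaluated in the
[Nocerino 85] study above, the sketch below shows a dynamic time warping
(DTW) isolated-word matcher with a pluggable local distortion measure. It is
not the authors' implementation. The function names, the toy one-dimensional
feature vectors, and the plain Euclidean distance (standing in for the IS,
LLR, CEP, or WSM measures named in the abstract) are all assumptions made
for illustration.

```python
import math


def euclidean_distortion(a, b):
    """Local distortion between two feature vectors.

    Assumption: a plain Euclidean distance, used here only as a
    stand-in for the spectral distortion measures in the abstract.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b, local=euclidean_distortion):
    """Accumulated distortion along the best time alignment of two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best accumulated distortion aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local(seq_a[i - 1], seq_b[j - 1])
            # Allowed steps: advance in seq_a only, in seq_b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates, local=euclidean_distortion):
    """Return the label of the stored template closest to the utterance."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word], local))


# Hypothetical toy data: each frame is a one-element feature vector.
templates = {
    "one": [[0.0], [1.0], [2.0]],
    "two": [[5.0], [5.0], [4.0]],
}
utterance = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched "one"
best = recognize(utterance, templates)
```

Passing a different function as `local` is the point at which alternative
distortion measures would be swapped in and compared.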

[NTIS 81] National Technical Information Service PB82-801051,
Speech Recognition by Computer, p. 300, October 1981.

Presents investigations on the recognition, synthesis, and processing of 
speech by computer and includes research on the acoustical, phonological, 
and linguistic processes necessary in the conversion of the various
waveforms by computers, in a bibliography containing 294 citations. 

[NTIS 86-1] National Technical Information Service PB86-852787/WLI,
Speech Synthesis and Speech Recognition by Computer, January 1985-
December 1985, (Citations from the INSPEC: Information Services for the
Physics and Engineering Communities Database), p. 166, December 1985.

Provides a bibliography that contains citations concerning the principles, 
designs, development, and various applications of computerized speech 
synthesis and speech recognition. 

[NTIS 86-2] National Technical Information Service PB86-871498/WLI,
Computer Voice Recognition: Market Aspects, 1983-June 1986 (Citations
from the Computer Database), p. 52, July 1986.

Contains citations concerning market aspects of voice recognition 
technology, discussing applications in manufacturing, finance, and
telecommunications.

[NTIS 86-3] National Technical Information Service PB86-871704/WLI,
Speech Recognition by Computer, October 1981-July 1986, (Citations from
the Computer Database), p. 46, July 1986.

Contains a bibliography of citations concerning research and development 
efforts in the computer recognition of speech signals.

[NTIS 86-4] National Technical Information Service PB86-852779/WLI,
Speech Synthesis and Speech Recognition by Computer, April 1983-1984
(Citations from the INSPEC: Information Services for the Physics and
Engineering Communities Database), p. 248, December 1985.

Provides a bibliography that contains citations concerning the principles,
designs, development, and various applications of computerized speech 
synthesis and speech recognition. 

[NTIS 87-1] National Technical Information Service PB87-864047/WCC,
Computer Voice Recognition: Market Aspects, January 1983 to July 1987,
(Citations from the Computer Database), p. 76, July 1987.

Includes an updated bibliography of citations concerning market aspects of 
voice recognition technology. 

[O'Neil 82] O'Neil, Edward F., "Voice Entry: Terminals You Can Talk
To", Data Communications, v. 11, pp. 133+, October 1982.

Reports on the increasing feasibility of voice-entry technology; presents 
some applications in which voice recognition is used. 

[Ogozalek 86] Ogozalek, V., and Van Praag, J., "Comparison of Elderly and
Younger Users on Keyboard and Voice Input Computer-Based Composition
Tasks", SIGCHI Bulletin ACM, pp. 205-211, April 1986.

[Osman 83] Osman, G., "An Exchange Protocol for Continuous Speech
Recognition and Synthesis System", Computer-Aided Design of
Multivariable Technological Systems, pp. 285-288, Pergamon Press,
1983. ISBN 0-08-029357-3.

[Paddock 83] Paddock, Harold E., "Voice Input: A Reality", The Internal
Auditor, v. 40, pp. 23-26, December 1983.

Argues that the advantages of automatic speech recognition are so great that 
devices capable of recognizing isolated words or short phrases from a 
vocabulary of between 10 and 30 words are economically practical in some 
applications. 

[Pallett 85] Pallett, D. S., "Performance Assessment of Automatic Speech
Recognizers", Journal of Research of the National Bureau of Standards, v.
90, pp. 371-387, September-October 1985.

This paper discusses the factors known to influence the performance of
automatic speech recognizers and describes test procedures for 
characterizing their performance. It is directed toward all the stakeholders 
in the speech community (researchers, vendors, and users); consequently, 
the discussion of test procedures is not directed toward the needs of specific 
users to demonstrate the performance characteristics of any specific
algorithmic approach or particular product. It relies significantly on 
contributions from an emerging consensus standards activity, especially 
material developed within the IEEE Working Group on Speech I/O 
Performance Assessment. 

[Pallett 86] Pallett, D., Assessing the Performance of Recognizers, pp.
277-308, McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Pay 81] Pay, B. E., and Evans, C. R., "An Approach to the Automatic
Recognition of Speech", International Journal of Man-Machine Studies,
v. 14, pp. 13-27, January 1981.

This paper describes some techniques employed at the National Physical
Laboratory in developing a practical system capable of recognizing human 
speech. The system, which is currently being evaluated in an extended series 
of trials, is capable of performing two main tasks: (1) recognizing key 
words embedded in continuous speech and (2) segmenting and recognizing 
continuous speech such as strings of numerals. 

[Pearkins 84] Pearkins, Jon, "Talking Replaces Keyboarding", Computer
Data, v. 9, p. 18, March 1984.

Discusses the changes that will take place in computers and their use in the 
next five to ten years, and emphasizes the need to be aware of these changes 
when doing long-range planning. 

[Peckham 83] Peckham, Jeremy, "The Logos Continuous Speech
Recognition System", Computer Bulletin, pp. 2-3, March 1983. 

Discusses Logos, one of the world's most advanced speech recognition 
systems, which was developed by Logica.

[Peckham 84] Peckham, J. B., "Speech Recognition—What is it Worth?",
Proceedings of the 1st International Conference of Speech Technology, p.
39, October 1984.

It is often assumed that, since speech is man’s most natural means of 
communicating, it is the ideal medium for communicating with machines. 
This paper addresses the issue of assessing the true worth of speech input in
the man-machine interface and proposes transaction time as one objective
measure. The economics of using speech input technology, related to its 
potential advantage over more traditional tactile input methods and different 
application markets, is also covered. 

[Peckman 86] Peckman, J., Human Factors in Speech Recognition, pp.
172-190, McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Pfauth 83] Pfauth, M., and Fisher, W. M., "Voice Recognition Enters
The Control Room", Control Engineering, pp. 147-150, September 1983.

Voice recognition is at the door waiting to enter the industrial control room. 
As this formerly esoteric technology crosses the threshold from laboratory 
curiosity to practical equipment, otherwise mundane, task-intensive 
workplaces will become exciting, synergistic, and more productive. 

[Philip 87] Philip, George, and Young, Elizabeth S., "Man-Machine
Interaction by Voice: Developments in Speech Technology", Journal of
Informational Sciences, v. 13, n. 1, pp. 3-23, 1987.

Outlines the limitations of existing means of communication with
computers and the background to developments in voice input/output 
technology. 

[Pierrel 87] Pierrel, J., Aspects of Man-Machine Voice Dialog, pp.
249-274, Cambridge University Press, 1987. ISBN 0-521-30983-2.

[Pister-Bourjot 87] Pister-Bourjot, C., and Haton, J., "Automatic Learning: An
Approach to the Adaptation of a Speech Recognition System to One or
Several Speakers", Speech Communication, v. 6, n. 1, pp. 43-54,
1 March 1987.

As part of a system for the automatic recognition of isolated words in a large 
vocabulary on the basis of an analytical approach, we considered the 
automatic speaker-adaptation of the system. This was carried out by means 
of an automatic learning procedure of the speakers' reference patterns, and 
by automatically adjusting the parameters of the system. This learning relies 
on a time alignment algorithm using acoustic-phonetic features that are
largely speaker independent. The learning session was successfully tested on 18
speakers out of 20 (10 women and 10 men) and the reference patterns thus 
obtained yielded good results during the recognition phase. We have now 
undertaken an analysis of the vowels by 15 speakers based upon descriptive 
statistics and statistical interpretation in order to design procedures of
normalization and of automatic generation of a speaker's vowel reference
patterns. 

[Pluhar 83] Pluhar, Kenneth, "Speech Recognition - An Exploding Future
for the Man-Machine Interface", Control Engineering, pp. 70-73, January
1983.

Discusses the application of speech recognition systems to industrial control 
problems. 

[Poock 80] National Technical Information Service AD A091055,
NPS-55-80-016, Experiments with Voice Input for Command and Control: Using
Voice Input to Operate a Distributed Computer Network, by G. K. Poock,
April 1980.

This report describes an experiment in which subjects used voice
recognition equipment to verbally enter commands to a computer network 
similar to that of a command and control center or shipboard information 
center. 

[Poock 81-1] Poock, G. K., "To Train Randomly or All at Once... That is
the Question", Proceedings of Voice Data Entry Systems Applications
Conference, October 1981. (Sponsored by Lockheed Missiles and Space Co.,
Santa Clara, California.)

[Poock 81-2] National Technical Information Service AD A102208,
NPS-55-81-013, A Longitudinal Study of Computer Voice Recognition
Performance and Vocabulary Size, by G. K. Poock, June 1981.

This research examined voice recognition performance as a function of time 
and showed no decrement in performance after 21 weeks. In addition, 
vocabulary sizes up to 240 utterances showed stable performance. 

[Poock 83-1] National Technical Information Service AD A130155,
Voice Recognition Performance With Naive Versus Practiced Users, by G.
K. Poock and B. J. Martin, June 1983.

[Poock 83-2] National Technical Information Service,
NPS-55-83-012PR, Simulated TACFIRE Input Procedure for Use With Voice Data
Entry, by G. K. Poock and E. F. Roland, April 1983.

[Poock 83-3] National Technical Information Service AD A129951,
NPS-55-83-005, Wearing Army Gas Masks While Talking to a Voice Recognition
System, by G. K. Poock and E. F. Roland, March 1983.

[Poock 83-4] National Technical Information Service NPS-55-83-017PR,
Final Summary: Voice Recognition/Input Issues for TACFIRE, by G. K.
Poock and E. F. Roland, March 1983.

[Poock 83-5] National Technical Information Service AD A127223,
NPS-55-83-003, The Effect of Feedback to Users of Voice Recognition
Equipment, by G. K. Poock and B. J. Martin, February 1983.

[Poock 83-6] National Technical Information Service AD A129975,
NPS-55-83-001, Voice Recognition Vocabulary Lists for the Army's TACFIRE
System, by G. K. Poock and E. F. Roland, January 1983.

[Poock 83-7] Poock, G. K., "Speech Recognition Research, Applications
and International Efforts", Human Factors Society, Spring 1983.

Presents a broad overview of the speech I/O industry on a national and
international level. Within this context, technical and human factors issues 
which are relevant in all countries are discussed. 

[Poock 84] National Technical Information Service AD A142554,
NPS55-84-002, Effects of Emotional and Perceptual Motor Stress on a
Voice Recognition System’s Accuracy: An Applied Investigation, by G. K.
Poock and B. J. Martin, February 1984.

[Poock 85] National Technical Information Service, AD A158001,
NPS55-85-012, An Examination of Some Error Correcting Techniques for
Continuous Speech Recognition Technology, by G. K. Poock and B. J.
Martin, June 1985.

[Poock 86] Poock, G. K., "A Longitudinal Study of Five Year Old Speech
Reference Patterns", Journal of the American Voice I/O Society, v. 3, pp.
13-18, June 1986.

[Prasad 87] Prasad, K., and Lamba, T., "Natural Language Interfaces
Based on Keyboard Extraction Using AWK", Microprocessors &
Microsystems, pp. 157-160, 1 April 1987.

[Pursley 85] Pursley, Roy, "Speech Technology-No Longer Small Talk
for Financial Software Users", Journal of Financial Software, v. 2, pp.
52-53, March/April 1985.

Points out that speech technology as a means of interfacing with a computer 
is particularly well-suited to use in the financial world.

[Quarmby 86] Quarmby, D., Silicon Devices for Speech Recognition, pp.
200-215, McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Reardon 87] Reardon, Tracey A., "Talk About Productivity", Words,
v. 15, pp. 22-23, December/January 1987.

Discusses PC-based voice recognition and voice response technology and 
how it enhances the way users do business. 

[Rehsoft 84] Rehsoft, C., "Voice Recognition at the Ford Warehouse in
Cologne", Proceedings of the 1st International Conference of Speech
Technology, p. 103, October 1984.

Voice recognition has proved to be effective with an online shipping system 
at the Ford parts distribution center in Cologne. As one of the very few
applications of this technology in Europe, this center employs eight parallel
workstations using voice recognition. This paper describes the system, 
especially the hardware and software used, and deals with ergonomic aspects 
to be observed when introducing voice recognition to the factory floor. The 
emphasis of this description is on the results of the system obtained at Ford 
and the consequences drawn from them for the introduction of voice 
recognition in general. 

[Reuhkala 83] Reuhkala, E., "Recognition of Strings of Discrete Symbols 
With Special Application to Isolated Word Recognition", Acta Polytechnica 
Scandinavica, pp. 1-92, 1983. 

[Rigoll 84] Rigoll, G., "Experiences in Interfacing Voice-Input/Output
Devices to Host Computers, NC-Machines and Robots", Proceedings of the
1st International Conference of Speech Technology, p. 93, October 1984.

The Fraunhofer-Institut für Arbeitswirtschaft und Organisation (IAO) in
Stuttgart performs contract research for industry and government. Several 
projects were carried out, concerning the integration of voice-input/output 
equipment into office automation and production systems, using various 
voice-input/output devices and chip sets. Among these projects were the use
of a voice-input device and a voice output board for NC-machine
programming and the integration of voice-input technology in quality control.
The experiences concerning the industrial application of voice-input/output 
technology and the difficulties in interfacing the devices are presented in this 
paper.

[Rigsby 82] Rigsby, Mike, Verbal Control With Microcomputers, p. 312,
Tab Books, 1982.

Provides an overview of speech and the problem it presents for machine 
recognition and a "hands-on" guide for operating a microcomputer that 
recognizes and responds to voice commands. 

[Roberts 86] Roberts, L., and others, "Improving Speaker Consistency in
an Automatic Speech Recognition Framework", Computer Speech and
Language, pp. 61-93, March 1986.

[Rollins 83] Rollins, A., Constantine, B., and Baker, S., "Speech
Recognition at Two Field Sites", CHI ’83: Human Factors in Computing
Systems, pp. 267-273, 1983. ISBN 0-89791-121-0.

[Rollins 85] Rollins, A. M., "Speech Recognition and Manner of Speaking
in Noise and in Quiet", Human Factors in Computing Systems, pp. 197-199,
14-18 April 1985. ISBN 0-89791-149-0.

[Ross 84] Ross, Steve, and MacAllister, Jeff, "Practical and Continuous
Speech Recognition", Computer Design, v. 23, pp. 69+, 15 June 1984.

Presents a continuous speech recognition system that accepts sentences of 
any length, and permits cost-effective voice-data entry in demanding real- 
world environments. 

[Rossi 83] Rossi, M., Nishinuma, Y., and Mercier, G., "Multi Speaker",
(FRENCH), Speech Communication, v. 2, n. 2-3, pp. 215-217, July 1983.

We present an algorithm for the recognition of vowels using acoustic cues 
other than formant values. The acoustic cues presented make use of 
information relative to the spectral or temporal distribution of energy. 
These cues are context-independent and we obtained a mean rate of 
recognition of 92% for several speakers. The most efficient cues were those 
of the features open/close and front/back; the cues of nasality, on the other 
hand, showed greater intersubject variability and defined distinct classes of 
speakers. The context independence of the cues with isolated words leads us
to expect good results for continuous speech. 

[Saitta 83] Saitta, L., "Experiments in Evidence Composition in a Speech
Understanding System", International Journal of Man-Machine Studies,
v. 19, pp. 19-31, July 1983.

A method for composing partial evidences in pattern recognition problems
is presented and experimental results, referring to speech understanding, are 
also discussed. 

The method is well suited for real-time problems, where speed and 
parallelism in taking decisions are fundamental requirements. The case 
study presented in the paper is a simple one, for the sake of clarity, but a 
generalization to complex production systems can be easily obtained. 

[Salfer 85] Salfer, D. L., Voice Automation Of Ship Control, Master's
Thesis, Naval Postgraduate School, Monterey, California, September 1985.

This thesis explores possible shipboard application of speech recognition 
technology. It includes a detailed analysis of tasks performed on the bridge, 
in the Combat Information Center and in the main engineering control space 
of an FFG-7 Frigate. 

[Santarelli 84] Santarelli, Mary-Beth, "Voice Recognition: Not Just a Lot of 
Talk", Software News, v. 4, pp. 44-45, December 1984. 

Explains that while voice recognition has been successfully used in factories 
for quality assurance and inventory applications, it may not be sophisticated 
enough to be used in the office environment. 

[Scagliola 83-1] Scagliola, C., "Continuous Speech Recognition Without 
Segmentation: Two Ways of Using Diphones as Basic Speech Units", Speech 
Communication, v. 2, n. 2-3, pp. 199-201, July 1983. 

[Scagliola 83-2] Scagliola, C., "Language Models and Search Algorithms for
Real-Time Speech Recognition", International Journal of Man-Machine
Studies, v. 22, pp. 523-547, 1983.

In this paper, the "continuous speech recognition" problem is given a clear 
mathematical formulation as the search for that sequence of basic speech 
units that best fits the input acoustic pattern. For this purpose spoken 
language models in the form of hierarchical transition networks are 
introduced, where lower level subnetworks describe the basic units as 
possible sequences of spectral states. The units adopted in this paper are 
either whole words or smaller subword elements, called diphones. The 
recognition problem thus becomes that of finding the best path through the 
network, a task carried out by the linguistic decoder. By using this 
approach, knowledge sources at different levels are strongly integrated. In 
this way, early decision making based on partial information (in particular 
any segmentation operation or the speech/silence distinction) is avoided:
usually this is a significant source of errors. Instead, decisions are deferred
to the linguistic decoder, which possesses all the necessary pieces of 
information. 

The properties that a linguistic decoder must possess in order to operate in
real-time are listed, and then a best-few algorithm with partial traceback of 
explored paths, satisfying the above requisites, is described. In particular, 
the amount of storage needed is almost constant for any sentence length, and 
the interpretation of early words in a sentence may be possible long before 
the speaker has finished talking. Experimental results with two systems, one 
with words and the other with diphones as basic speech units, are reported. 
Finally, relative merits of words and diphones are discussed, taking into 
account aspects such as the storage and computing time requirements, their 
relative ability to deal with phonological variations and to discriminate 
between similar words, their speaker adaptation capability, and the ease with 
which it is possible to change the vocabulary and the language dependencies.

Recognition Based on a Diphone Spotting Approach", Cybernetic Systems:
Recognition, Learning, Self-Organization, pp. 73-83, Research Studies

"Voice Synthesis and Recognition", Mini Systems, v. 15, pp. 146+,

[Schmandt 85] Schmandt, C., Voice Communication With Computers, pp.
133-160, Ablex Publishing Company, 1985. ISBN 0-89381-244-1. 

[Schotola 84] Schotola, T., "On the Use of Demisyllables in Automatic Word
Recognition", Speech Communication, v. 3, n. 1, pp. 63-87, April 1984.

This paper describes experiments on automatic speech recognition using 
demisyllables as segmentation units and the consonant clusters contained 
therein as decision units for classification. As compared to the large number 
of different demisyllables, the use of consonant clusters reduces the class 
inventory considerably. In order to test the method, three experiments 
dealing with isolated German words were carried out. In the first 
experiment the syllabic segmentation of words was investigated; in the
second experiment the methods for classification of consonant clusters were
tested. In the third experiment a complete 1000-word recognition system
was developed which performed the segmentation, the classification of
consonant clusters and vowels, and a correction of recognition errors by use
of a phonetic lexicon. Demisyllable segmentation and processing have
proved suitable, especially for large vocabularies.

[Scott 83] Scott, Brian L., "Voice Recognition Systems and Strategies",
Computer Design, v. 22, pp. 67-70, January 1983.

Describes word verification as an approach to voice recognition that 
overcomes the processing and memory-intensive demands of large system 
vocabularies. 

[Seaman 82] Seaman, John, "Voice: New Ways With an Old Medium",
Computer Decisions, v. 14, pp. 62+, March 1982.

Discusses applications of voice processing and describes voice processing 
equipment for data entry (recognition) and response (synthesis). 

[Seaman 83] Seaman, John, "The Latest Word in Voice Recognition",
Computer Decisions, v. 15, pp. 48+, February 1983.

Examines the new Votan Model V5000 voice recognition and voice response 
unit. 

[Seaman 85] Seaman, J., Voice: New Ways With an Old Medium, pp. 85-91,
Hayden Book Co., 1985. ISBN 0-8104-6329-6.

[Senensieb 84] Senensieb, G. A., "Speech Input and Output--A Survey of 
Available Products", Proceedings of the 1st International Conference of 
Speech Technology, p. 57, October 1984. 

The capabilities of current speech input and output technology are explained 
and assessed with reference to a selection of existing products. Included in 
the survey are speech recognition products, single synthesizers, and text-to- 
speech systems. The tangible benefits of applying speech technology are 
summarized and the author's view of a challenge for the future is presented. 

[Shapiro 84] Shapiro, E., "A Business Computer, A Business Program, and
More on Voice Recognition", Byte, pp. 147-154, February 1984.

Recent developments raise some questions about perceived industry trends.

[Shapiro 85] Shapiro, S. F., "Speech Recognition Produces Natural
Interface", Computer Design, pp. 59-62, March 1985.

[Shore 83] Shore, J. E., and Burton, D. K., "Discrete Utterance Speech
Recognition Without Time Alignment", IEEE Transactions on Information
Theory, pp. 472-491, July 1983.

[Silverman 85] Silverman, H. F., "One Architectural Approach for Speech
Recognition Processors", Algorithmically Specialized Parallel Computers,
pp. 129-148, Academic Press, Inc., 1985. ISBN 0-12-654130-2.

[Siroux 85] Siroux, J., and Gillet, D., "A System for Man-Machine
Communication Using Speech", Speech Communication, v. 4, pp. 289-315,
December 1985.

KEAL is a continuous speech recognition system developed at the CNET 
laboratory in Lannion (France). Part of the laboratory’s current work aims 
at extending it in the direction of a speech-understanding and man-machine 
dialog system. A question-answer-type dialog is set in motion in order to 
provide the user with information (the current application consists in 
simulating a directory inquiries service). This paper describes how 
syntactic, semantic, and pragmatic knowledge is used for implementing such 
a dialog, and the main advantages and drawbacks of the methods chosen are 
discussed. Sentence recognition is performed by a left-to-right bottom-up 
parser by means of a semantic context-free grammar. Using a method 
analogous to that of semantic attributes, the parse-tree is then interpreted in 
order to obtain a semantic structure which represents the information 
relevant to the subsequent dialog. The dialog manager uses the semantic 
structure for instantiating a model graph, which represents the state of the
dialog at any instant; it indicates the next message to be sent to the user, and
how to analyze his answer. An example derived from the directory
inquiries service is described.

[Smith 83] Smith, F. J., and Linggard, R. J., "Information Retrieval by
Voice Input and Output", Research and Development in Information
Retrieval, pp. 275-288, Springer-Verlag New York, Inc., 1983.
ISBN 0-387-11978-7.

[Smith 84] Smith, Emily T., and Harris, Marilyn A., "More Than a
Whisper of Hope for Computers You Can Talk To", Business Week,
p. 92F-H, 17 December 1984.

Examines the new IBM experimental computer which has a system capable 
of recognizing 5,000 spoken words with 95% accuracy. 






[Spine 84] Spine, T., Williges, B. H., and Maynard, J. F., "An Economical
Approach to Modeling Speech Recognition Accuracy", International
Journal of Man-Machine Studies, v. 21, pp. 191-202, September 1984.

Accuracy of speech recognizer decisions is an important criterion for 
maintaining both system effectiveness and user satisfaction. A
central-composite design methodology is recommended as an economical means to
develop empirical prediction equations for speech recognizer performance 
incorporating a number of influential factors. Factors manipulated in the 
central-composite design included number of training passes, reject 
threshold, difference score, and size of the active vocabulary. The factorial 
combination of two noncontinuous variables, sex of the speaker and
inter-word confusability, was also investigated by replicating the
central-composite design to create four sets of data. Standard least-squares multiple
regression analysis was used to develop the four sets of prediction equations, 
each of which accounted for at least 50% of the variance in recognizer 
performance. A cross-validation study revealed that shrinkage was not 
excessive. Subsequently, these empirical models were incorporated into an 
interactive design tool for a dialogue author where the percentage of correct 
recognition is automatically optimized when the dialogue author enters the 
size of the vocabulary to be used or both the vocabulary size and desired 
number of training passes. The design tool can also be used to make 
predictions anywhere within the response surface. Use of these efficient 
data collection procedures along with the interactive design tool should 
greatly assist the dialogue author in predicting the impact of various 
language, task, environmental, algorithmic, human, and performance 
evaluation factors on speech recognition accuracy. 

[Stephens 83] Stephens, Ron, "Make the Way for Another Revolution",
Modern Offices, v. 28, pp. 96+, October 1983.

Suggests that many of the current methods of communicating and 
manipulating information which have traditionally been dependent on 
keyboard entry, may soon be replaced by voice-based procedures, causing a 
major transformation with the automated office. 

[Strat Inc 81] Voice Input/Output: Markets, Technologies & Applications,
p. 110, Strategic Inc., 1981.

Analyzes the advantages of voice I/O, states of the market technology trends 
in speech synthesis, future applications, voice response, text-to-voice, 
language translations, aids to handicapped and computer output. Electronic 
voice mail, dictation/word processing, computer I/O automation, games, 
etc., also are included.

[Sweeney 86] Sweeney, M. J., and Bitar, K. J., An Analysis of Friendly
Input Devices for the Control of the Naval Warfare Interactive Simulation
System, Master's Thesis, Naval Postgraduate School, Monterey, California,
March 1986. AD S9333.

This thesis describes an experiment conducted at the Naval Postgraduate 
School (NPS) during the period 15 October through 28 October 1985. 
Specifically, the experiment evaluates "pull-down window" micro-computer 
technology, continuous speech recognition equipment, and standard 
computer keyboard entry to input commands and control environment. 
Using the Naval Warfare Interactive Simulation System (NWISS) as a 
controlled medium, military problems were posed to test subjects in specific 
light and noise environments. Although the results are not entirely 
conclusive, they do demonstrate a distinct advantage in using continuous 
speech or keyboard entry modes over the drop-down window technology of 
the Macintosh (if subject training time is not a significant restriction). 
Either the continuous speech or the keyboard method was clearly superior in 
all environments. 

[Taggart 81] National Technical Information Service AD-A105 568, Voice
Recognition as an Input Modality for the TACCO Preflight Data Insertion
Task in the P-3C Aircraft, by John Laughlin Taggart and Charles Darwin
Wolfe, Jr., p. 150, March 1981.

Reports the results of an experiment to compare accuracy and entry speed 
capabilities of a standard keyboard with the Threshold Technology T-600 
voice recognition unit in the performance of an operational data entry task 
in the P-3C aircraft. 

[Tanaka 83] Tanaka, A., and others, "A Study of the Syllable Oriented
Recognition of Continuous Speech", Speech Communication, v. 2, n. 2-3, pp.
207-210, July 1983.

[Taylor 86] Taylor, M., Voice Input Applications in Aerospace, pp. 322-337,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Tecosky 86] Tecosky, T., Interfacing Standards for Recognizers, pp. 244-255,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

[Teja 83] Teja, E. R., and Gonnella, G., Voice Recognition Technology,
p. 212, Reston Publishing Co., 1983. ISBN 0835984176.

[Thompson 84] Thompson, H., "Artificial Intelligence and Speech Processing:
The Good News and the Bad News", Proceedings of the 1st International
Conference of Speech Technology, p. 217, October 1984.

Discusses author’s expectations about the contributions we can and cannot 
expect from Artificial Intelligence to Speech Processing over the next few 
years. 

[Thompson 85] Thompson, Linde, "Voice Recognition Systems: A Sound 
Investment in the Future", News 34-38, pp. 59+, March 1985. 

Looks at the present and the future uses of voice recognition. 

[Tyler 86] Tyler, J., "Speech Recognition System Using Walsh Analysis
and Dynamic Programming", Microcomputers & Microsystems, pp. 427-433,
October 1986.

[Underwood 84] Underwood, M. J., "Human Factors Aspects of Speech
Technology", Proceedings of the 1st International Conference of Speech
Technology, p. 223, October 1984.

Regards speech technology as a means to an end, and not an end in itself. 
Discusses the human component in the speech technology system and its 
importance. 

[Viglione 84] Viglione, S. S., "Trends in Development of Speech
Recognition Systems", Proceedings of the 1st International Conference of
Speech Technology, p. 169, October 1984.

Discusses the inherent superiority of speech over other modes of human 
communications and the growing need for better control of complex 
machines. Discusses the major role of man-machine communication 
through the use of speech recognition and speech response systems. 

[Viglione 86] Viglione, S., Recognition Past and Future, pp. 373-387,
McGraw-Hill Inc., 1986. ISBN 0-07-007913-7.

Discusses the inherent superiority of speech over other modes of human
communication and the growing need for better control of complex
machines. Discusses the major role of man-machine command through the
use of speech recognition and speech response systems.

[Visser 87] Visser, Roger, "Voice Recognition Fills Technical Barriers",
Manufacturing Engineering, v. 98, pp. CT-24 to CT-26, May 1987.

Discusses voice recognition, the technology which allows people to interact
with computers using voice instead of keyboards and terminals and which 
has been successfully implemented by numerous manufacturers from steel 
and car makers to circuit board designers. 

[Wagner 87] Wagner, M., "A Speech Recognition Experiment With the
Entire Syllable Inventory of Standard Chinese", Speech Communication, v.
6, pp. 363-369, 1 December 1987.

This paper explores the possibility of using automatic speech recognition as 
a front end to a computer for Chinese character processing. A speech 
recognition experiment has been performed with the complete inventory of 
second-tone syllables of Standard Chinese. Two recordings of this 
inventory, which were made 48 hours after one another, were used as test 
and reference sets. It is shown that the distribution of intrasyllable distances 
and the distribution of intersyllable distances overlap considerably for the 
full inventory of 260 second-tone syllables. The recognition rate was 
determined as a function of the syllable size and is 47.3% for the complete 
syllable inventory. 

[Watrous 85] Watrous, Raymond, "Speech Input/Output: Support for
Integration", Journal of Computer-Integrated Manufacturing Management,
v. 1, pp. 37-44, Spring 1985.

Describes the current status of speech I/O technology and defines some of 
the terminology associated with the technology followed by a discussion of 
the technology's advantages and successful use. 

[Wetterlind 86] Wetterlind, Peter James, "A Speech Error Correction 
Algorithm for Natural Language Input Processing", Computer Science, v. 
17, p. 300, 1986, UMI order number: AD A86-25455. 

This research experiment consisted of construction of a system for 
identifying a natural language sentence using only speaker independent 
phonemes as the input. The motivating hypothesis for the experiment is that 
spoken sentences can be recognized from limited phoneme input. The 
research system accepts only strings of consonant phonemes, which are 
recognizable in a speaker independent environment. The original 'spoken' 
sentence is reproduced from the consonant phonemes and formatted as a 
word sequence for subsequent transmission to a natural language processing 
system. The system uses a vocabulary of general words and an expandable 
dictionary of domain specific words during the sentence recognition 
process.

[White 84] White, G. M., "Speech Recognition: An Idea Whose Time is
Coming", Byte, pp. 213-225, January 1984.

Some theoretical and practical aspects of this emerging technology are 
presented. 

[Wilson 84] Wilson, J., "Where Do We Go from Here?", Proceedings of
the 1st International Conference of Speech Technology, p. 181, October
1984.

Discusses the background and evolution of future speech technology 
products and services. 

[Williams 85] Williams, John M., "Computer Knows its Programmer’s 
Voice", Government Computer News, v. 4, p. 32, 5 July 1985. 

Discusses a quadriplegic's voice recognition system which allows him to
perform the same tasks as other computer programmers. 

[Withers 83] Withers, S. J., "Voice Control of an Interactive Simulation",
Simulation, pp. 28-29, January 1983.

A low cost, microcomputer-based voice recognition device makes a 
convenient input channel for an interactive model of a manufacturing 
system. The problems with current hardware are its limited capabilities and 
unreliable operation. However, the potential exists for useful voice control 
of simulations in the near future. 

[Wood 86] Wood, Lamont, "Voices in the Wilderness", Computer
Decisions, v. 18, pp. 34+, 8 April 1986.

States that voice recognition is a long way from becoming a widely accepted 
office technology but, nevertheless, today's voice recognition systems do 
have valuable applications, especially on the shop floor and in the 
warehouse. 

[Woods 85] Woods, Tom, "Computers Learn to Listen", Business
Computer Systems, v. 4, pp. 80+, March 1985.

Suggests that today's pioneering speech recognition products provide a
glimpse of the exciting technologies and diverse business applications soon 
to come. 

[Wyatt 85] Wyatt, Jim, and Elbon, Dave, "Computers That Listen and
Talk", Cause/Effect, v. 8, pp. 9+, July 1985.

Points out that when considering voice input/output, the terms voice storage
and playback, voice recognition, and voice synthesis can be used to 
characterize tasks being performed, and explains. 

[Yalabik 84] Yalabik, N., and Unal, F., "An Efficient Algorithm for
Recognizing Isolated Turkish Words", New Systems and Architectures for
Automatic Speech Recognition and Synthesis, pp. 419-426, 2-14 July 1984.

[Yannakoudakis 85] Yannakoudakis, E. J., "Voice I/O: Problems and
Perspectives", Computer Bulletin, v. 1, pp. 10-12, September 1985.

Discusses one university's approach to computer voice I/O with the
playback or recognition of speech units through the application of rules in an
algorithmic manner. 4 references.

[Yellen 83] Yellen, H. W., A Preliminary Analysis of Human Factors
Affecting the Recognition Accuracy of a Discrete Word Recognizer for C3
System, Master's Thesis, Naval Postgraduate School, Monterey, California,
March 1983. AD A128546.

Literature pertaining to voice recognition abounds with information 
relevant to the assessment of transitory speech recognition devices. In the
past, engineering requirements have dictated the path this technology 
followed. But, other factors do exist that influence recognition accuracy. 
This thesis explores the impacts of human factors on the successful 
recognition of speech, principally addressing the differences or variability 
among users. A Threshold Technology T-600 was used for a 100 utterance 
vocabulary to test 44 subjects. A statistical analysis was conducted on five 
generic categories of human factors: occupational, operational, 
psychological, physiological, and personal. How the equipment is trained 
and the experience level of the speaker were found to be key characteristics 
influencing recognition accuracy. To a lesser extent computer experience, 
time of week, accent, vital capacity and rate of air flow, speaker 
cooperativeness, and anxiety were found to affect overall error rate. 

[Zue 83] Zue, V. W., "The Use of Phonetic Rules in Automatic Speech
Recognition", Speech Communication, v. 2, n. 2-3, pp. 181-186, July 1983.

[Zue 84] Zue, V. W., and Huttenlocher, D. P., "Computer Recognition
of Isolated Words from Large Vocabularies: Lexical Access Using Partial
Phonetic Information", Institute of Information Science, pp. 343-347, 1984.

[Poock 80]


[Dabbagh 86] 


[Martin 86] 


[Poock 83-2] 


[EDP Anal 83] 


[Mascarenas 84]


[Poock 83-3] 


[Elenius 86] 


[Meloni 83] 


[Poock 83-4] 


[Elster 80] 


[Menke 87] 


[Poock 83-6] 


[Eskenazi 83] 


[Mokhoff 84] 


[Poock 83-7] 


[Fallside 85] 


[Moody 85] 


[Poock 84] 


[Fallside 86] 


[Myers 83] 


[Prasad 87] 


[Ford 83] 


[Neil 81] 


[Pursley 85] 


[Foster 82] 


[Niemann 85] 


[Rehsoft 84] 


[Friedman 84] 


[NTIS 86-1] 


[Rollins 85] 


[Good 84] 


[NTIS 86-2]


[Ross 84] 


[GovDatSys 86] 


[NTIS 86-3]


[Salfer 85] 


[Green 83] 


[NTIS 86-4] 


[Santarelli 84] 


[Green 85] 


[NTIS 87-1]


[Schalk 83] 


[Hager 86] 


[O’Neil 82] 


[Schmandt 85] 


[Hobbs 84] 


[Ogozalek 86] 


[Seaman 82]

[Seaman 83]


[Sweeney 86] 


[Visser 87] 


[Seaman 85] 


[Taylor 86] 


[Wagner 87] 


[Senensieb 84] 


[Tecosky 86] 


[Watrous 85] 


[Shapiro 84] 


[Teja 83] 


[White 84] 


[Shapiro 85] 


[Thompson 84] 


[Wood 86] 


[Siroux 85] 


[Thompson 85] 


[Woods 85] 


[Smith 83] 


[Underwood 84] 


[Wyatt 85] 


[Smith 84] 


[Viglione 84] 


[Yalabik 84] 


[Stephens 83] 


[Viglione 86] 


[Yellen 83] 


SECTION 2. MULTILINGUAL FACTORS




[Eskenazi 83] 


[Niemann 85] 


[Wagner 87] 


[Meloni 83] 


[Pister-Bourjot 87] 


[Yalabik 84] 


[Neil 81] 


[Prasad 87] 




SECTION 3. MULTICULTURAL FACTORS




[Eskenazi 83] 


[Ogozalek 86] 


[Salfer 85] 


[Meloni 83] 


[Pister-Bourjot 87] 


[Wagner 87] 


[Neil 81] 


[Prasad 87] 


[Yalabik 84] 


[Niemann 85] 






SECTION 4. COMMAND AND CONTROL ENVIRONMENTS


[Cerf-Danon 87] [Pfauth 83] 


[Poock 83-4] 


[Hobbs 84] 


[Pister-Bourjot 87] 


[Poock 83-6] 


[LeFever 87] 


[Pluhar 83] 


[Salfer 85] 


[Neil 81] 


[Poock 80] 


[Sweeney 86] 


[Niemann 85] 


[Poock 83-2] 


[Yellen 83] 


SECTION 5. HIGH NOISE ENVIRONMENTS




[Elster 80] 


[Pluhar 83] 


[Rehsoft 84] 


[Martin 84] 


[Poock 83-3] 


[Rollins 85] 


[Pfauth 83] 


[Poock 84] 





SECTION 6. LOW-LIGHT ENVIRONMENTS
[Salfer 85]

APPENDIX C3 SITUATIONAL FACTORS

SECTION 1. SITUATIONAL FACTORS
[Bakst 87]
[Blunden 80]
[Bristow 86-1]
[Bristow 86-2]
[Brown 87]
[Bruce 82]
[Cater 84]
[Cavazza 84]
[Cerf-Danon 87]
[Clements 87]
[Cochran 83]
[Cole 85]
[Connolly 86]
[Conrad 83]
[Dabbagh 86]
[Damper 84]
[Damper 85]
[EDP Anal 83]
[Elenius 86]
[Elster 80]
[Eskenazi 83]
[Fallside 85]
[Fallside 86]
[Fisher 86]
[Ford 83]
[Foster 82]
[Friedman 84]
[Good 84]
[GovDatSys 86]
[Green 83]
[Green 85]
[Hager 86]
[Hill 86]
[Hunter 85]
[Int Res Dev 85]
[Int Res Dev 87]
[Ivall 86-1]
[Ivall 86-2]
[Joost 83]
[Kohonen 85]
[Kurzweil 86]
[Lea 86]
[LeFever 87]
[Leggett 82]
[Llaurado 82]
[Maenobu 84]
[Martin 86]
[Mascarenas 84]
[Menke 87]
[Mokhoff 84]
[Moody 85]
[Myers 83]
[Neil 81]
[NTIS 86-1]
[NTIS 86-2]
[NTIS 86-3]



[NTIS 86-4] 

[NTIS 87] 

[O'Neil 82] 
[Paddock 83] 
[Pallett 85] 

[Pallett 86] 
[Pearkins 84] 
[Peckham 83] 
[Peckman 86] 
[Philip 87] 

[Pierrel 87] 
[Pister-Bourjot 87] 
[Pluhar 83] 

[Poock 80] 

[Poock 83-7] 
[Poock 84] 

[Prasad 87] 
[Pursley 85] 
[Rehsoft 84] 

[Salfer 85] 
[Santarelli 84] 
[Schalk 83] 
[Schmandt 85] 
[Seaman 82] 
[Seaman 83] 
[Seaman 85] 
[Senensieb 84] 
[Shapiro 84]
[Shapiro 85]
[Siroux 85] 
[Smith 83] 
[Stephens 83] 
[Taylor 86] 
[Tecosky 86] 
[Teja 83] 



[Thompson 84] 
[Thompson 85] 
[Underwood 84] 
[Viglione 84] 
[Viglione 86] 
[Visser 87] 



[Watrous 85] 



[White 84] 



[Williams 85] 



[Wood 86] 
[Woods 85] 
[Wyatt 85] 
[Yellen 83] 



SECTION 2. MULTIUSER OR GROUP USAGE 



[LeFever 87] 

SECTION 3. INDIVIDUAL USAGE 
[Pister-Bourjot 87] 

[Hill 86] 

SECTION 4. HANDICAP SITUATIONS 
[Damper 84] [Fisher 86] 

[Damper 85] [Kurzweil 86] 



[Cerf-Danon 87] 



[Maenobu 84] 

[Neil 81] 

[Pister-Bourjot 87] 
[Pluhar 83] 



[Poock 80] 
[Prasad 87] 
[Salfer 85] 
[Yellen 83] 



[Connolly 86] 
[Eskenazi 83] 
[Kohonen 85]

APPENDIX C4 QUANTITATIVE FACTORS



SECTION 1. QUANTITATIVE FACTORS 

[Anatharaman 86] [Gould 83] [Moody 85] 

[Myers 83] 



[Baker 84] 
[Bisiani 84] 
[Blunden 80] 
[Bristow 86-1] 
[Bristow 86-2] 
[Brown 87] 
[Bruce 82] 
[Calcaterra 82] 
[Cater 84] 
[Cavazza 84] 
[Clements 87] 
[Cochran 83] 
[Cole 85] 
[Conrad 83] 
[Dabbagh 86] 
[Dillman 84] 
[EDP Anal 83] 
[Elenius 86] 
[Elster 80] 
[Epstein 86] 
[Fallside 85] 
[Fallside 86] 
[Ford 83] 
[Foster 82] 
[French 83] 
[Friedman 84] 
[Good 84] 



[GovDatSys 86] 
[Green 83] 

[Green 85] 
[Gubrynowicz 84] 
[Hager 86] 
[Harrison 84] 

[Hill 86] 

[Hobbs 84] 

[Hunter 85] 

[Int Res Dev 85] 
[Int Res Dev 87] 
[Ivall 86-1] 

[Ivall 86-2]
[Johnson 86] 

[Joost 83] 

[Koelsch 87] 
[Kurzweil 86] 

[Lea 86] 

[LeFever 87] 
[Leggett 82] 
[Llaurado 82] 
[Lombardo 84] 
[Martin 84] 
[Martin 86] 

[Mascarenas 84]
[Meisel 84] 
[Mokhoff 84] 



[NTIS 86-1] 
[NTIS 86-2] 
[NTIS 86-3] 
[NTIS 86-4] 
[NTIS 87-1] 
[O'Neil 82] 
[Paddock 83] 
[Pallett 85]
[Pallett 86]
[Pearkins 84]
[Peckham 83] 
[Peckman 86] 
[Pfauth 83] 
[Philip 87] 
[Pierrel 87] 
[Pluhar 83] 
[Poock 83-1] 
[Poock 83-7] 
[Poock 84] 
[Poock 85] 
[Pursley 85] 
[Reardon 87] 
[Rehsoft 84] 
[Saitta 83] 
[Santarelli 84] 
[Schalk 83]
[Schmandt 85]


[Smith 83] 


[Underwood 84] 


[Scott 83] 


[Smith 84] 


[Viglione 84] 


[Seaman 82] 


[Stephens 83] 


[Viglione 86] 


[Seaman 83] 


[Sweeney 86] 


[Visser 87] 


[Seaman 85] 


[Taylor 86] 


[Watrous 85] 


[Senensieb 84] 


[Tecosky 86] 


[White 84] 


[Shapiro 84] 


[Teja 83] 


[Wood 86] 


[Shapiro 85] 


[Thompson 84] 


[Woods 85] 


[Siroux 85] 


[Thompson 85] 


[Wyatt 85] 


SECTION 2. TIME 


[Anatharaman 86] 


[Dillman 84] 


[Hill 86] 


[Brown 87] 


[Epstein 86] 


[Scott 83] 


SECTION 3. ACCURACY 




[Calcaterra 82] 


[Elster 80] 


[Koelsch 87] 


[Dillman 84] 


[French 83] 


[Meisel 84] 


SECTION 4. SPEED OF ENTRY 




[Anatharaman 86] 


[Dillman 84] 


[Meisel 84] 


[Bisiani 84] 


[Hill 86] 


[Sweeney 86] 



SECTION 5. EASE OF USE 
[Epstein 86] 

SECTION 6. PRODUCTIVITY 
[Hager 86] 

[Pfauth 83] 

[Reardon 87]

APPENDIX C5 TRAINING FACTORS

SECTION 1. TRAINING FACTORS
[Anisworth 84]
[Baker 84]
[Banatre 83]
[Biermann 85-2]
[Blunden 80]
[Bridle 83]
[Bristow 86-1]
[Bristow 86-2]
[Brown 87]
[Bruce 82]
[Calcaterra 82]
[Cater 84]
[Cavazza 84]
[Cerf-Danon 87]
[Clements 87]
[Cochran 83]
[Cole 85]
[Connolly 86]
[Conrad 83]
[Cook 85]
[Dabbagh 86]
[Damper 85]
[De Mori 84]
[De Mori 85-1]
[De Mori 85-3]
[DI Martino 84]
[EDP Anal 83]
[Elenius 86]
[Elster 80]
[Epstein 86]
[Fallside 85]
[Fallside 86]
[Ford 83]
[Foster 82]
[French 83]
[Friedman 84]
[Frison 84-1]
[Frison 84-2]
[Good 84]
[GovDatSys 86]
[Green 83]
[Green 85]
[Gubrynowicz 84]
[Hager 86]
[Harrison 84]
[Hobbs 84]
[Howell 83]
[Hunt 83]
[Hunter 85]
[Int Res Dev 85]
[Int Res Dev 87]
[Ivall 86-1]
[Ivall 86-2]
[Johnson 85]
[Johnson 86]
[Joost 83]



[Kurzweil 86] 

[Lea 86] 

[Leggett 82] 
[Levinson 86] 
[Llaurado 82] 
[Lombardo 84] 
[Longuet-Higgins 85] 
[Mackie 87] 
[Maenobu 84] 

[Martin 86] 

[Mascarenas 84]
[Mavaddat 85] 
[Meade 85] 

[Meisel 84] 

[Meloni 83] 

[Meloni 87] 

[Menke 87] 

[Mokhoff 84] 

[Moody 85] 

[Moore 84-1] 

[Moore 84-2] 

[Myers 83] 
[Nakagawa 84] 
[Niemann 85] 
[Nishida 86] 
[Nocerino 85] 

[NTIS 86-1] 

[NTIS 86-2]

[NTIS 86-3]


[Pursley 85] 


[Smith 84] 


[NTIS 86-4] 


[Rehsoft 84] 


[Spine 84] 


[NTIS 87-1] 


[Reuhkala 83] 


[Stephens 83] 


[O’Neil 82] 


[Roberts 86] 


[Sweeney 86] 


[Ogozalek 86] 


[Rollins 85] 


[Tanaka 83] 


[Osman 83] 


[Ross 84] 


[Taylor 86] 


[Paddock 83] 


[Rossi 83] 


[Tecosky 86] 


[Pallett 85] 


[Salfer 85] 


[Teja 83] 


[Pallett 86] 


[Santarelli 84] 


[Thompson 84] 


[Pay 81] 


[Scagliola 83-2] 


[Thompson 85] 


[Pearkins 84] 


[Scagliola 84] 


[Underwood 84] 


[Peckham 83] 


[Schalk 83] 


[Viglione 84] 


[Peckman 86]


[Schmandt 85] 


[Viglione 86] 


[Philip 87] 


[Schotola 84] 


[Visser 87] 


[Pierrel 87] 


[Scott 83] 


[Watrous 85] 


[Pister-Bourjot 87] [Seaman 82] 


[Wetterlind 86] 


[Pluhar 83] 


[Seaman 83] 


[White 84] 


[Poock 81-1] 


[Seaman 85] 


[Williams 85] 


[Poock 81-2] 


[Senensieb 84] 


[Wood 86] 


[Poock 83-1] 


[Shapiro 84] 


[Woods 85] 


[Poock 83-3] 


[Shapiro 85] 


[Wyatt 85] 


[Poock 83-5] 


[Shore 83] 


[Yellen 83] 


[Poock 83-7] 


[Siroux 85] 


[Zue 83] 


[Poock 84] 
[Poock 85] 


[Smith 83] 


[Zue 84] 


SECTION 2. SPEAKER DEPENDENT SYSTEMS
[Cook 85]
[Epstein 86]
[Pister-Bourjot 87] 

[Rossi 83] 


SECTION 3. SPEAKER INDEPENDENT SYSTEMS


[Anisworth 84] 


[Maenobu 84] 


[Pister-Bourjot 87]


[Connolly 86] 


[Menke 87] 


[Rossi 83]

SECTION 4. CONTINUOUS SPEECH RECOGNITION



[Banatre 83] 


[Lombardo 84] 


[Osman 83] 


[Bridle 83] 


[Maenobu 84] 


[Pay 81] 


[Connolly 86] 


[Meisel 84] 


[Poock 85] 


[De Mori 85-3] 


[Meloni 87] 


[Ross 84] 


[DI Martino 84] 


[Moore 84-1] 


[Rossi 83] 


[Frison 84-1] 


[Moore 84-2] 


[Tanaka 83] 


[Frison 84-2] 


[Nakagawa 84] 


[Zue 83] 


[Hunt 83] 


[Niemann 85] 





SECTION 5. DISCRETE SPEECH RECOGNITION 
[French 83] 

[Reuhkala 83] 

[Shore 83] 

SECTION 6. RECOGNITION ACCURACY 



[Calcaterra 82] 


[Meade 85] 


[Scagliola 84] 


[Elster 80] 


[Meloni 83] 


[Schotola 84] 


[French 83] 


[Nishida 86] 


[Scott 83] 


[Gubrynowicz 84] 


[Nocerino 85] 


[Smith 84] 


[Howell 83] 


[Poock 81-1] 


[Spine 84] 


[Levinson 86] 


[Poock 85] 


[Tanaka 83] 


[Longuet-Higgins 85] 


[Roberts 86] 


[Wetterlind 86] 


[Mackie 87] 


[Rollins 85] 


[Yellen 83] 


[Maenobu 84] 


[Scagliola 83-2] 


[Zue 83] 



[Mavaddat 85] 



APPENDIX C6 HOST COMPUTER FACTORS 



SECTION 1. HOST COMPUTER FACTORS 



[Armstrong 80] 


[Ford 83] 


[Martin 86] 


[Bakst 87] 


[Foster 82] 


[Mascarenas 84] 


[Banatre 83] 


[Friedman 84] 


[Meisel 84] 


[Blunden 80] 


[Good 84] 


[Menke 87] 


[Bridle 87] 


[Gould 83] 


[Mod Mat 83] 


[Bristow 86-1] 


[GovDatSys 86] 


[Mokhoff 84] 


[Bristow 86-2] 


[Green 83] 


[Moody 85] 


[Brown 87] 


[Green 85] 


[Murveit 83] 


[Bruce 82] 


[Haas 84] 


[Myers 83] 


[Calcaterra 82] 


[Hager 86] 


[NTIS 86-1] 


[Cashen 86] 


[Hill 86] 


[NTIS 86-2] 


[Cater 84] 


[Hunter 85] 


[NTIS 86-3] 


[Cavazza 84] 


[Int Res Dev 85] 


[NTIS 86-4] 


[Clements 87] 


[Int Res Dev 87] 


[NTIS 87-1] 


[Cochran 83] 


[Ivall 86-1] 


[O’Neil 82] 


[Cole 85] 


[Ivall 86-2] 


[Ogozalek 86] 


[Conrad 83] 


[Jinper 85] 


[Paddock 83] 


[Cook 85] 


[Joost 83] 


[Pallett 85] 


[Dabbagh 86] 


[Keller 85] 


[Pallett 86] 


[De Mori 85-2] 


[Koelsch 87] 


[Pearkins 84] 


[De Mori 85-3] 


[Korzeniowski 86] 


[Peckham 83] 


[Dillman 84] 


[Kurzweil 86] 


[Peckman 86] 


[EDP Anal 83] 


[Lea 86] 


[Philip 87] 


[Elenius 86] 


[Leggett 82] 


[Pierrel 87] 


[Elster 80] 


[Llaurado 82] 


[Pluhar 83] 


[Epstein 86] 


[Lombardo 84] 


[Poock 80] 


[Fallside 85] 


[Madron 84] 


[Poock 83-7] 


[Fallside 86] 


[Mariani 83] 


[Pursley 85] 
[Rehsoft 84] 


[Shapiro 85] 


[Underwood 84] 


[Rigoll 84] 


[Silverman 85] 


[Viglione 84] 


[Rigsby 82] 


[Siroux 85] 


[Viglione 86] 


[Santarelli 84] 


[Smith 83] 


[Visser 87] 


[Schalk 83] 


[Stephens 83] 


[Watrous 85] 


[Schmandt 85] 


[Sweeney 86] 


[White 84] 


[Seaman 82] 


[Taylor 86] 


[Wood 86] 


[Seaman 83] 


[Tecosky 86] 


[Woods 85] 


[Seaman 85] 


[Teja 83] 


[Wyatt 85] 


[Senensieb 84] 


[Thompson 84] 


[Zue 83] 


[Shapiro 84] 


[Thompson 85] 




SECTION 2. MICROCOMPUTERS 




[Calcaterra 82] 


[Haas 84] 


[Lombardo 84] 


[Dabbagh 86] 


[Hill 86] 


[Madron 84] 


[Elenius 86] 


[Jinper 85] 


[Mariani 83] 


[Epstein 86] 


[Keller 85] 


[Murveit 83] 


[Friedman 84] 


[Koelsch 87] 


[Rigsby 82] 


[Good 84] 


[Korzeniowski 86] 


[Sweeney 86] 


SECTION 3. MAINFRAMES 




[Calcaterra 82] 


[Cashen 86] 




SECTION 4. NETWORKS 




[Banatre 83] 


[De Mori 85-3] 




[Bridle 87] 


[Poock 80] 




SECTION 5. TYPE OF ENTRY REQUIRED 




[Armstrong 80] 


[Cook 85] 


[Meisel 84] 


[Bakst 87] 


[Hill 86] 


[Pluhar 83] 


[Cochran 83] 


[Koelsch 87] 
APPENDIX C7 EXPERIMENTS AND RESEARCH 



SECTION 1. EXPERIMENTS AND RESEARCH 



[Allen 83] 
[Anatharaman 86] 
[Andrews 84] 
[Anisworth 84] 
[Armstrong 81] 
[Baker 84] 

[Bakst 87] 
[Banatre 83] 
[Berman 84] 
[Betterton 83] 
[Bierfert 85] 
[Biermann 84] 
[Biermann 85-1] 
[Biermann 85-2] 
[Bisiani 84] 
[Blunden 80] 
[Bridle 82] 

[Bridle 83] 

[Bridle 84] 

[Bridle 87] 
[Bristow 86-1] 
[Bristow 86-2] 
[Bronson 85] 
[Brown 87] 

[Bruce 82] 
[Calcaterra 82] 
[Cashen 86] 

[Cater 84] 



[Cavazza 84] 
[Cerf-Danon 87] 
[Clements 87] 
[Cochran 83] 
[Cole 85] 
[Connolly 86] 
[Conrad 83] 
[Cook 85] 
[Dabbagh 86] 
[Damper 84] 
[Damper 85] 

[De Mori 84] 

[De Mori 85-1] 
[De Mori 85-2] 
[De Mori 85-3] 
[De Mori 87-1] 
[De Mori 87-2] 
[Dillman 84] 

[DI Martino 84] 
[EDP Anal 83] 
[Elenius 86] 
[Elster 80] 
[Epstein 86] 
[Eskenazi 83] 
[Fallside 85] 
[Fallside 86] 
[Ford 83] 

[Foster 82] 



[Friedman 84] 
[Frison 84-1] 
[Frison 84-2] 
[Good 84] 

[Gould 83] 
[GovDatSys 86] 
[Green 83] 

[Green 85] 
[Gubrynowicz 84] 
[Haas 84] 

[Hager 86] 

[Haton 85] 

[Haton 87] 

[Henkle 83] 

[Hill 86] 

[Hobbs 84] 
[Howell 83] 

[Hunt 83] 

[Hunter 85] 

[Int Res Dev 80] 
[Int Res Dev 85] 
[Int Res Dev 87] 
[Ivall 86-1] 

[Ivall 86-2] 

[Jinper 85] 
[Johnson 85] 
[Johnson 86] 

[Joost 83] 
[Keller 85] 

[Koelsch 87] 
[Kohonen 85] 
[Korzeniowski 86] 
[Kurzweil 86] 

[Kuzela 86] 

[Lea 86] 

[LeFever 87] 

[Leggett 82] 
[Levinson 86] 
[Llaurado 82] 
[Lombardo 84] 
[Longuet-Higgins 85] 
[Lundquist 82] 
[Mackie 87] 

[Madron 84] 
[Maenobu 84] 
[Mariani 83] 

[Martin 84] 

[Martin 86] 

[Mascarenas 84] 
[Mavaddat 85] 
[McCracken 81] 
[Meade 85] 

[Meisel 84] 

[Meisel 86] 

[Meloni 83] 

[Meloni 87] 

[Menke 87] 

[Minault 87] 

[Mod Mat 83] 
[Mokhoff 84] 

[Moody 85] 

[Moore 84-1] 



[Moore 84-2] 
[Murveit 83] 
[Myers 83] 
[Nakagawa 84] 
[Neil 81] 

[Niemann 84] 
[Niemann 85] 
[Nishida 86] 
[Nocerino 85] 
[NTIS 81] 

[NTIS 86-2] 

[NTIS 86-3] 

[NTIS 86-4] 

[NTIS 86-1] 

[NTIS 87-1] 
[O'Neil 82] 
[Ogozalek 86] 
[Osman 83] 
[Paddock 83] 
[Pallett 85] 

[Pallett 86] 

[Pay 81] 

[Pearkins 84] 
[Peckham 83] 
[Peckman 86] 
[Pfauth 83] 

[Philip 87] 

[Pierrel 87] 
[Pister-Bourjot 87] 
[Pluhar 83] 

[Poock 80] 

[Poock 81-1] 
[Poock 81-2] 
[Poock 83-1] 



[Poock 83-2] 
[Poock 83-3] 
[Poock 83-4] 
[Poock 83-5] 
[Poock 83-6] 
[Poock 83-7] 
[Poock 84] 
[Poock 85] 
[Poock 86] 
[Prasad 87] 
[Pursley 85] 
[Quarmby 86] 
[Reardon 87] 
[Rehsoft 84] 
[Reuhkala 83] 
[Rigoll 84] 
[Rigsby 82] 
[Roberts 86] 
[Rollins 83] 
[Rollins 85] 
[Ross 84] 

[Rossi 83] 
[Saitta 83] 
[Salfer 85] 
[Santarelli 84] 
[Scagliola 83-1] 
[Scagliola 83-2] 
[Scagliola 84] 
[Schalk 82] 
[Schalk 83] 
[Schmandt 85] 
[Schotola 84] 
[Scott 83] 
[Seaman 82] 
[Seaman 83] 


[Taggart 81] 


[Wetterlind 86] 


[Seaman 85] 


[Tanaka 83] 


[White 84] 


[Senensieb 84] 


[Taylor 86] 


[Williams 85] 


[Shapiro 84] 


[Tecosky 86] 


[Wilson 84] 


[Shapiro 85] 


[Teja 83] 


[Withers 83] 


[Shore 83] 


[Thompson 84] 


[Wood 86] 


[Silverman 85] 


[Thompson 85] 


[Woods 85] 


[Siroux 85] 


[Tyler 86] 


[Wyatt 85] 


[Smith 83] 


[Underwood 84] 


[Yalabik 84] 


[Smith 84] 


[Viglione 84] 


[Yannakoudakis 85] 


[Spine 84] 


[Viglione 86] 


[Yellen 83] 
411 Waverley Oaks Rd. 

Waltham, Massachusetts 02154-8465 
(617)893-5151 

KVS (Kurzweil Voicesystem): Voice Input (for IBM PC, XT, AT) 
20270 Goldenrod Lane 
Germantown, Maryland 20874 
(301)428-3227 or 1(800)635-3355 
Auburn, Washington 98002 
Pronounce Voice Control System: Voice Input (for IBM) 

MIMIC, INC. 

P.O. Box 705 

Islington, Massachusetts 02090-0705 
(617)329-9593 

Mimic Speech Processor: VOIS (Voice Output for Industrial Systems): 
Voice Input/Output (for OEM; Microcomputer) 
Radio & Transmission Division 
2740 Prosperity Ave. 

Fairfax, Virginia 22031 
(703)698-5540 

AR-10: Voice Input/Output (for IBM) 

DP-200: Voice Input (for RS-232C; RS-422; IEEE-488; 20 mA current loop) 
SAR-10: Voice Input/Output (for IBM) 

SR-10: Voice Input (for RS-232C) 

SR-100: Voice Input (for RS-232C; NEC) 



PERIPHONICS CORP. 

4000 Veterans Memorial Hwy. 

Bohemia, New York 11716 
(516)467-0500 

TeleMarketer: Voice Input (for CDC; DG; DEC; HIS; IBM; NCR; Unisys; 
Wang; PABX; ACD) 

VoicePac Announcement System: Voice Input/Output (for CDC; DG; 
DEC; HIS; IBM; NCR; Unisys; Wang; PABX; ACD) 

SCOTT INSTRUMENTS CORP. 

1111 Willow Springs Dr. 

Denton, Texas 76205 

(817) 387-9514 

Coretechs VET-3 Voice Entry Terminal: Voice Input/Output (for RS-232C) 

Shadow/VET Voice Entry Terminal: Voice Input (for Apple) 

VET-2 Voice Entry Terminal: Voice Input (for Apple) 

SHURE BROTHERS, INC. 

222 Hartrey Ave. 

Evanston, Illinois 60202-3696 
(312)866-2200 

SM10 Headset Microphone: Voice Input (for OEM) 

VR 230 Two Way Headset: Voice Input/Output (for OEM) 

VR300 Gooseneck Microphone: Voice Input (for OEM) 

503BG Close-Talk Microphone: Voice Input (for OEM) 

512 Two Way Headset: Voice Input/Output (for OEM) 

SPEECH, LTD. 

3790 El Camino Real, Suite 213 
Palo Alto, California 94306 
(415)858-2207 

Protalker: Voice Input/Output (for IBM; OEM; Microcomputer) 

SPEECH SYSTEMS, INC. 

18356 Oxnard St. 

Tarzana, California 91356 

(818) 881-0885 

DS100 Phonetic Engine: Voice Input (for RS-232C) 

PE200 Phonetic Engine: Voice Input (for IBM; RS-232C) 
SUDBURY SYSTEMS, INC. 

31 Union Ave. 

Sudbury, Massachusetts 01776 
(617)443-8966 or 1(800)245-7817 

RTAS: Voice Input/Output 

SUNCOAST SYSTEMS, INC. 

3100 McCormick St., 

Suite 22, P.O. Box 7105 
Pensacola, Florida 32514 
(904)478-6477 or 1(800)843-9363 

Computerfone: Voice Input/Output (for OEM) 

TECMAR, INC. 

6225 Cochran Rd. 

Solon, Ohio 44139 
(216)349-1009 

Voice Recognition Board: Voice Input (for IBM PC) 

TEXAS INSTRUMENTS, INC. 

P.O. Box 655012 
Dallas, Texas 75265 
1(800)527-3500 

Speech Command System: Voice Input/Output (for IBM; TI) 

VOICE COMPUTER TECHNOLOGIES CORP. 

5730 Oakbrook Pkwy 
Norcross, Georgia 30093-1888 
(404)441-2303 

VCT Series 2000 Model 2016: Voice Input/Output (for CDC; DG; DEC; 
HIS; IBM; NCR; Unisys; Microcomputer) 

THE VOICE CONNECTION 
17835 Sky Park Circle, Suite C 
Irvine, California 92714 
(714)261-2366 

IntroVoice I: Voice Input (for Apple II, Apple IIe; RS-232C) 

IntroVoice II: Voice Input (for Apple) 

IntroVoice III: Voice Input (for IBM PC, XT, AT) 
IntroVoice V: Voice Input (for IBM; Compaq 386) 

IntroVoice VI: Voice Input (for IBM PS/2, PC, XT, AT; Compaq 386) 
PVDL (Portable Voice Data Logger): Voice Input/Output (for IBM) 
VMC 2020: Voice Input (for Apple II, IIe) 

VOICE INDUSTRIES CORP. (VERBEX) 

10 Madison Ave. 

Morristown, New Jersey 07960 
(201)267-7505 

Series 4000; 5000: Voice Input (for RS-232C) 

VOTAN 

4487 Technology Dr. 

Fremont, California 94538 
(415)490-7600 

Voice Management System: Voice Input/Output (for RS-232C; 
Centronics parallel) 

Votan Voice Card (Board Level): Voice Input/Output (for IBM) 

VSP 1000 (Board Level): Voice Input/Output (for IEEE-786) 

VTR 3270: Voice Input/Output (for IBM; Coax) 
VTR-6050 Series II: Voice Input/Output (for RS-232C) 

VYNET CORP. 

180 Knowles Dr. 

Los Gatos, California 95030 
(408)370-0555; (408)370-9764; or 
1(800)538-7002 

V2100 Telephone Voice Response System: Voice Input/Output (for IBM) 
V2301/V1202/V2202 Telephone Speech Digitizer & Playback: 
Voice Input/Output (for IBM) 

V4000 Telephone Voice Response System: Voice Input/Output (for IBM) 

XTRA BUSINESS SYSTEMS 
2350 Qume Dr. 

San Jose, California 95131 
(408)945-8950 

Voice Communications System: Voice Input/Output (for XTRA Series) 